40 results for Data Pre-processing
Abstract:
The general aim of the thesis was to study university students’ learning from the perspective of regulation of learning and text processing. The data were collected from the two academic disciplines of medical and teacher education, which share the features of highly scheduled study, a multidisciplinary character, a complex relationship between theory and practice and a professional nature. Contemporary information society poses new challenges for learning, as it is not possible to learn all the information needed in a profession during a study programme. Therefore, it is increasingly important to learn how to think and learn independently, how to recognise gaps in and update one’s knowledge and how to deal with the huge amount of constantly changing information. In other words, it is critical to regulate one’s learning and to process text effectively. The thesis comprises five sub-studies that employed cross-sectional, longitudinal and experimental designs and multiple methods, from surveys to eye tracking. Study I examined the connections between students’ study orientations and the ways they regulate their learning. In total, 410 second-, fourth- and sixth-year medical students from two Finnish medical schools participated in the study by completing a questionnaire measuring both general study orientations and regulation strategies. The students were generally deeply oriented towards their studies. However, they regulated their studying externally. Several interesting and theoretically reasonable connections between the variables were found. For instance, self-regulation was positively correlated with deep orientation and achievement orientation and was negatively correlated with non-commitment. However, external regulation was likewise positively correlated with deep orientation and achievement orientation but also with surface orientation and systematic orientation. It is argued that external regulation might function as an effective coping strategy in the cognitively loaded medical curriculum. Study II focused on medical students’ regulation of learning and their conceptions of the learning environment in an innovative medical course where traditional lectures were combined with problem-based learning (PBL) group work. First-year medical and dental students (N = 153) completed a questionnaire assessing their regulation strategies of learning and views about the PBL group work. The results indicated that external regulation and self-regulation of the learning content were the most typical regulation strategies among the participants. In line with previous studies, self-regulation was connected with study success. Strictly organised PBL sessions were not considered as useful as lectures, although the students’ views of the teacher/tutor and the group were mainly positive. Therefore, developers of teaching methods are challenged to think of new solutions that facilitate reflection on one’s learning and that improve the development of self-regulation. In Study III, a person-centred approach to studying regulation strategies was employed, in contrast to the traditional variable-centred approach used in Study I and Study II. The aim of Study III was to identify different regulation strategy profiles among medical students (N = 162) across time and to examine to what extent these profiles predict study success in preclinical studies. Four regulation strategy profiles were identified, and connections with study success were found.
Students with the lowest self-regulation and with an increasing lack of regulation performed worse than the other groups. As the person-centred approach enables us to identify individual students with diverse regulation patterns, it could be used in supporting student learning and in facilitating the early diagnosis of learning difficulties. In Study IV, 91 student teachers participated in a pre-test/post-test design where they answered open-ended questions about a complex science concept both before and after reading either a traditional, expository science text or a refutational text that prompted the reader to revise his or her beliefs in line with the scientific view of the phenomenon. The student teachers completed a questionnaire concerning their regulation and processing strategies. The results showed that the students’ understanding improved after the text-reading intervention and that the refutational text promoted understanding better than the traditional text. Additionally, regulation and processing strategies were found to be connected with understanding the science phenomenon. A weak trend suggested that weaker learners would benefit more from the refutational text. It seems that learners with effective learning strategies are able to pick out the relevant content regardless of the text type, whereas weaker learners might benefit from refutational parts that contrast the most typical misconceptions with scientific views. The purpose of Study V was to use eye tracking to determine how third-year medical students (n = 39) and internal medicine residents (n = 13) read and solve patient case texts. The results revealed differences between medical students and residents in processing patient case texts; compared to the students, the residents were more accurate in their diagnoses and processed the texts significantly faster and with a lower number of fixations. Different reading patterns were also found. The observed differences between medical students and residents in processing patient case texts could be used in medical education to model expert reasoning and to teach how a good medical text should be constructed. The main findings of the thesis indicate that even among very selected student populations, such as high-achieving medical students or student teachers, there seems to be considerable variation in regulation strategies of learning and text processing. As these learning strategies are related to successful studying, students enter educational programmes with rather different chances of managing and achieving success. Further, the ways of engaging in learning seldom centre on a single strategy or approach; rather, students seem to combine several strategies to a certain degree. Sometimes, which way of learning is best can be a matter of perspective; therefore, the reality of studying in higher education is often more complicated than the simplistic view of self-regulation as a good quality and external regulation as a harmful quality. The beginning of university studies may be stressful for many, as the gap between high school and university studies is huge and those strategies that were adequate during high school might not work as well in higher education. Therefore, it is important to map students’ learning strategies and to encourage them to use high-quality learning strategies from the beginning. Instead of separate courses on learning skills, the integration of these skills into course contents should be considered.
Furthermore, learning complex scientific phenomena could be facilitated by paying attention to high-quality learning materials and texts and to other support from the learning environment, in the university as well. Eye tracking seems to have great potential for evaluating performance and growing diagnostic expertise in text processing, although more research using texts as stimuli is needed. Both medical and teacher education programmes, and the professions themselves, are challenging in terms of their multidisciplinary nature and the increasing amounts of information involved, and they therefore require good lifelong learning skills both during the study period and later in working life.
Abstract:
The main objective of this study was to perform a statistical analysis of ecological type from optical satellite data, using Tipping's sparse Bayesian algorithm. The thesis applies the Relevance Vector Machine (RVM) algorithm to ecological classification between forestland and wetland. This binary classification technique was further used to classify several other tree species and to produce a hierarchical classification of the subclasses of a given target class. We also attempted to use an airborne image of the same forest area: combining it with image analysis and different image processing operations, we extracted informative features and later used them to classify forestland and wetland.
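The abstract gives no implementation details, but a minimal sketch of the binary forestland/wetland classification it describes might look as follows. Note that scikit-learn ships no Relevance Vector Machine, so a support vector classifier is swapped in as a stand-in for Tipping's sparse Bayesian learner, and the spectral features and labels are synthetic placeholders.

```python
# Hedged sketch: per-pixel binary classification of forestland vs.
# wetland from optical band features. An SVC stands in for the RVM
# (scikit-learn has no RVM); all data below is synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((500, 4))                      # 500 pixels x 4 optical bands
# Synthetic labels: 1 = forestland, 0 = wetland
y = (X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.standard_normal(500) > 0.8).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```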
Abstract:
Producing operative data for end users with analytical examination in mind causes problems for many companies. This master's thesis aims to solve this problem at Teleste Oyj. The work is divided into three main chapters. Chapter 2 clarifies the concept of On-Line Analytical Processing (OLAP). Chapter 3 introduces some vendors of OLAP products and their architectures, as well as typical application areas and issues to be considered when deploying OLAP. Chapter 4 presents the actual solution. The technical architecture plays a significant role in the structure of the solution; here, Microsoft's data warehousing framework is applied. As Chapter 4 proceeds, transaction-processing data is transformed into information and further into end-user knowledge. End users are equipped with an efficient, real-time analysis tool in a multidimensional environment. Although inventory turnover is taken as the application example, the thesis does not attempt to find the optimal level for Teleste's inventories. Nevertheless, some improvement suggestions are given.
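As a rough illustration of the multidimensional, end-user-facing analysis the thesis describes, the following pandas sketch pivots hypothetical transaction rows into a turnover view along two dimensions; the column names, figures and product groups are invented and do not come from Teleste's system.

```python
# OLAP-style view: aggregate a small fact table along two dimensions
# (product group x month) into an inventory-turnover "cube" slice.
# All rows are hypothetical illustration data.
import pandas as pd

facts = pd.DataFrame({
    "month":         ["2003-01", "2003-01", "2003-02", "2003-02"],
    "group":         ["amplifiers", "modems", "amplifiers", "modems"],
    "qty_sold":      [120, 80, 95, 110],
    "avg_inventory": [300, 200, 310, 190],
})
# Inventory turnover for the period = quantity sold / average inventory
facts["turnover"] = facts["qty_sold"] / facts["avg_inventory"]
cube = facts.pivot_table(index="group", columns="month", values="turnover")
print(cube.round(2))
```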
Abstract:
Globalization and the development of an information society are rapidly changing the shape of the modern world. Cities, and especially megacities such as Saint-Petersburg, are at the centre of these changes. As a result, economic activities connected with receiving and processing information now play a very important role in the economy of megacities, which allows them to be characterised as "information cities". Despite wide experience in addressing information issues, Russia, and Saint-Petersburg in particular, lags behind the advanced European countries in the development of information systems. This master's thesis is devoted to the development of an information system (a data transmission network) based on wireless technology in the Saint-Petersburg region, within the framework of the FTOP "Electronic Russia" and RTOP "Electronic Saint-Petersburg" programmes. Logically, the thesis can be divided into three parts: 1. the problems, purposes, expected results, schedule and implementation of the "Electronic Russia" programme; 2. a discussion of wireless data transmission networks (description of the technology, justification of its choice, signal transmission techniques and types of network topology); 3. the realisation of the network (organisation of the central network node, regional centres and access lines; description of the equipment used; network capabilities), the financial provision of the project, and possible network management models.
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious for human experts to program. The tasks for which such tools are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers developing kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in question in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, the incorporation of prior knowledge is often done by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions. We also design a fast cross-validation algorithm for regularized least-squares types of learning algorithms. Further, an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks is proposed. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel kernels and cost functions can be used efficiently in the algorithms.
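The fast cross-validation result mentioned above can be connected to a well-known closed form: for regularized least-squares, all leave-one-out residuals can be computed from a single fit via the hat matrix, as e_i = (y_i - ŷ_i) / (1 - H_ii). The numpy sketch below illustrates this on synthetic data; it is not the thesis implementation, and the RBF kernel and regularization value are arbitrary choices.

```python
# Kernel regularized least-squares with closed-form leave-one-out
# residuals: H = K (K + lambda I)^{-1}, loo_i = (y_i - yhat_i) / (1 - H_ii).
# Synthetic 1-D data; illustrative sketch only.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(60)

def rbf_kernel(A, B, gamma=0.5):
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq_dists)

lam = 0.1
K = rbf_kernel(X, X)
H = K @ np.linalg.inv(K + lam * np.eye(len(X)))    # hat matrix
yhat = H @ y
loo_residuals = (y - yhat) / (1.0 - np.diag(H))    # one fit, all n folds
print("leave-one-out MSE:", np.mean(loo_residuals ** 2))
```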
Abstract:
Data traffic caused by mobile advertising client software communicating with the network server can be a pain point for many application developers who are considering advertising-funded application distribution, since the cost of the data transfer might scare users away from the applications. For the thesis project, a simulation environment was built to mimic the real client-server solution and to measure the data transfer over varying connection types and usage scenarios. To optimise the data transfer, a few general-purpose and XML-specific compressors were tried on the XML data, and a few protocol optimisations were implemented. To optimise the cost, cache usage was improved and pre-loading was enhanced to use free connections for loading the data. The data traffic structure and the various optimisations were analysed, and it was found that cache usage and pre-loading should be enhanced and that the protocol should be changed, with report aggregation and compression using WBXML or gzip.
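To give a feel for the gzip option named in the findings, the sketch below measures how much a small, repetitive XML report shrinks under gzip. The payload is invented, and WBXML is omitted because the Python standard library has no encoder for it.

```python
# Rough illustration of the transfer saving from gzip-compressing a
# repetitive XML ad-impression report. The payload is made up.
import gzip

records = "".join(
    f'<impression ad="banner-{i % 3}" ts="17000000{i:03d}"/>' for i in range(200)
)
xml = f"<report>{records}</report>".encode("utf-8")
packed = gzip.compress(xml)
print(f"raw: {len(xml)} bytes, gzipped: {len(packed)} bytes "
      f"({100 * len(packed) / len(xml):.0f}% of original)")
```

Repetitive tag and attribute names compress extremely well, which is why XML payloads typically benefit so much from even a general-purpose compressor.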
Abstract:
The objectives of this research work, “Identification of the Emerging Issues in Recycled Fiber Processing”, are to discover emerging research issues and to present new approaches for identifying promising research themes in recovered paper application and production. The proposed approach consists of identifying technological problems often encountered in wastepaper preparation processes, as well as ways of improving the quality of recovered paper and increasing its proportion in the composition of paper and board. The source of information for problem retrieval is scientific publications discussing wastepaper application and production. The study has exploited several research methods to understand the changes related to the utilization of recovered paper. All the assembled data were carefully studied and categorized using the RefViz and CiteSpace software. Suggestions were made on the classes of problems that need further investigation in order to propose emerging research trends in recovered paper.
Abstract:
The ability of the supplier firm to generate and utilise customer-specific knowledge has attracted increasing attention in the academic literature during the last decade. It has been argued that customer knowledge should be treated as a strategic asset, like any other intangible asset. Yet, at the same time, it has been shown that the management of customer-specific knowledge is challenging in practice and that many firms are better at acquiring customer knowledge than at making use of it. This study examines customer knowledge processing in the context of key account management in large industrial firms. This focus was chosen because key accounts are demanding and complex: it is not unusual for a single key account relationship to constitute a complex web of relationships between the supplier and the key account, easily leading to the dispersion of customer-specific knowledge within the supplier firm. Although the importance of customer-specific knowledge generation has been widely acknowledged in the literature, surprisingly little attention has been paid to the processes through which firms generate, disseminate and use such knowledge internally to enhance the relationships with their major, strategically important key account customers. This thesis consists of two parts. The first part comprises a theoretical overview and draws together the main findings of the study, whereas the second part consists of five complementary empirical research papers based on survey data gathered from large industrial firms in Finland. The findings suggest that the management of customer knowledge generated about and from key accounts is a three-dimensional process consisting of acquisition, dissemination and utilization. It can be concluded from the results that customer-specific knowledge is a strategic asset, because the supplier’s customer knowledge processing activities have a positive effect on the supplier’s key account performance. Moreover, in examining the determinants of each phase separately, the study identifies a number of intra-organisational factors that facilitate the process in supplier firms. The main contribution of the thesis lies in linking the concept of customer knowledge processing to the previous literature on key account management. Moreover, given that this literature is mainly conceptual or case-based, a further contribution is to examine its consequences and determinants based on quantitative empirical data.
Abstract:
Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) are some of the mathematical preliminaries discussed prior to explaining the partial least squares (PLS) and principal component regression (PCR) models. Both PLS and PCR are applied to real spectral data, and their differences and similarities are discussed in this thesis. The challenge lies in establishing the optimum number of components to include in either model, but this has been overcome by using various diagnostic tools suggested in this thesis. Correspondence analysis (CA) and PLS were applied to ecological data. The idea of CA was to correlate the macrophyte species and the lakes. The differences between the PLS model for the ecological data and the PLS model for the spectral data are noted and explained in this thesis.
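A compact way to see the PCR/PLS contrast the thesis discusses is to fit both on the same data. The sketch below uses scikit-learn on synthetic data standing in for spectra (not the thesis data); in practice the number of components would be chosen with the kind of diagnostics the thesis suggests.

```python
# PCR (PCA + linear regression) versus PLS on synthetic "spectral" data.
# Key difference: PCA chooses components ignoring y, while PLS picks
# directions that covary with y. Illustrative sketch only.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))     # 100 samples x 50 "wavelengths"
y = X[:, :5] @ rng.standard_normal(5) + 0.1 * rng.standard_normal(100)

n_comp = 5                              # would be tuned via diagnostics/CV
pcr = make_pipeline(PCA(n_components=n_comp), LinearRegression())
pls = PLSRegression(n_components=n_comp)
for name, model in [("PCR", pcr), ("PLS", pls)]:
    print(name, "mean CV R^2:", cross_val_score(model, X, y, cv=5).mean().round(3))
```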
Abstract:
The importance of efficient supply chain management has increased due to globalization and the blurring of organizational boundaries. Various supply chain management technologies have been identified as drivers of organizational profitability and financial performance. Organizations have historically concentrated heavily on the flow of goods and services, while less attention has been dedicated to the flow of money. As supply chains become more transparent and automated, new opportunities for financial supply chain management have emerged through information technology solutions and comprehensive financial supply chain management strategies. This research concentrates on the end part of the purchasing process: the handling of invoices. Efficient invoice processing can have an impact on an organization’s working capital management and thus provide companies with better readiness to face the challenges related to cash management. Leveraging a process mining solution, the aim of this research was to examine the automated invoice handling process of four different organizations. The invoice data was collected from each organization’s invoice processing system. The sample included all the invoices the organizations had processed during the year 2012. The main objective was to find out whether e-invoices are faster to process in an automated invoice processing solution than scanned invoices (post entry into the invoice processing solution). Other objectives included looking into the longest lead times between process steps and the impact of manual process steps on cycle time. The processing of invoices from maverick purchases was also examined. Based on the results of the research and previous literature on the subject, suggestions for improving the process were proposed. The results indicate that scanned invoices were processed faster than e-invoices, mostly due to the more complex processing of e-invoices. It should be noted, however, that the manual tasks related to turning a paper invoice into electronic format through scanning are ignored in this research. The transitions with the longest lead times in the invoice handling process included both pre-automated steps and manual steps performed by humans. When the most common manual steps were examined in more detail, it was clear that these steps prolonged the process. Regarding invoices from maverick purchases, the evidence shows that these invoices were slower to process than invoices from purchases conducted through e-procurement systems and from preferred suppliers. Suggestions for improving the process included increasing invoice matching, reducing manual steps and leveraging different value-added services such as an invoice validation service, mobile solutions and supply chain financing services. For companies that have already reaped all the process efficiencies, the next step is to engage in collaborative financial supply chain management strategies that can benefit the whole supply chain.
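The lead-time analysis described above reduces, at its core, to timestamp differences between consecutive events per invoice. A minimal pandas sketch over invented event-log rows (not the case organizations' data) is:

```python
# Process-mining style lead times: elapsed time between consecutive
# process steps for each invoice. Event rows are invented examples.
import pandas as pd

log = pd.DataFrame({
    "invoice": ["A", "A", "A", "B", "B"],
    "step": ["received", "matched", "approved", "received", "approved"],
    "ts": pd.to_datetime([
        "2012-03-01 09:00", "2012-03-01 12:30", "2012-03-05 10:00",
        "2012-03-02 08:00", "2012-03-02 09:15",
    ]),
})
log = log.sort_values(["invoice", "ts"])
log["lead_time"] = log.groupby("invoice")["ts"].diff()
# Average lead time into each step, across invoices
print(log.dropna().groupby("step")["lead_time"].mean())
```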
Abstract:
Although the concept of a multi-product biorefinery provides an opportunity to meet future demands for biofuels, biomaterials and chemicals, it is not assured that its implementation would improve the profitability of kraft pulp mills. The attractiveness will depend on several factors, such as mill age and location, government incentives, economy of scale, end-user requirements and how much value can be added to the new products. In addition, the effective integration of alternative technologies is not straightforward and has to be carefully studied. In this work, detailed balances were performed to evaluate the possible impacts that lignin removal, hemicellulose recovery prior to pulping, and torrefaction and pyrolysis of wood residues have on conventional mill operation. The development of the mill balances was based on theoretical fundamentals, practical experience, a literature review, personal communication with technology suppliers and analysis of mill process data. Hemicellulose recovery through pre-hydrolysis of chips has impacts at several stages of the kraft process. Effects can be observed on the pulping process, wood consumption, black liquor properties and, inevitably, pulp quality. When lignin is removed from black liquor, it mostly affects the chemical recovery operation and the steam generation rate. Since mineral acid is used to precipitate the lignin, impacts on the mill’s chemical balance are also expected. A great advantage of processing the wood residues for additional income is that the pulping process, pulp quality and sales are not harmfully affected. For pulp mills interested in implementing the concept of a multi-product biorefinery, this work has indicated possible impacts to be considered in a technical feasibility study.
Abstract:
Because of the increased availability of different kinds of business intelligence technologies and tools, it is easy to fall into the illusion that new technologies will automatically solve a company’s data management and reporting problems. Management is not only about managing technology but also about managing processes and people. This thesis focuses more on traditional data management and on the performance management of production processes, both of which can be seen as requirements for long-lasting development. Some operative BI solutions are also considered as part of the ideal state of the reporting system. The objectives of this study are to examine what requirements the effective performance management of production processes places on a company’s data management and reporting, and to see how these affect its efficiency. The research is carried out as a theoretical literature review of the subjects and as a qualitative case study of a reporting development project at Finnsugar Ltd. The case study is examined through theoretical frameworks and through active participant observation. To get a better picture of the ideal state of the reporting system, simple investment calculations are performed. According to the results of the research, the requirements for the effective performance management of production processes are automated data collection, integration of operative databases, use of efficient data management technologies such as ETL (Extract, Transform, Load) processes, a data warehouse (DW) and Online Analytical Processing (OLAP), and the efficient management of processes, data and roles.
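As a concrete image of the ETL step named in the results, here is a minimal extract-transform-load pass in pandas; the source table, cleaning rules and target file are all hypothetical and not taken from the Finnsugar case.

```python
# Toy ETL pass: extract operative rows, transform (coerce types, drop
# incomplete rows, derive a measure), load into a fact-table file.
# Everything here is hypothetical illustration.
import pandas as pd

# Extract: in practice a query against an operative production database
raw = pd.DataFrame({
    "line": [1, 2, 3],
    "product": ["sugar-1kg", "sugar-1kg", "syrup-0.5l"],
    "output_kg": ["1200", "980", None],   # strings and gaps: need cleaning
})

# Transform: type coercion, removal of incomplete rows, derived KPI column
fact = (raw
        .assign(output_kg=pd.to_numeric(raw["output_kg"]))
        .dropna()
        .assign(output_t=lambda d: d["output_kg"] / 1000.0))

# Load: a CSV file stands in for the data warehouse fact table
fact.to_csv("fact_production.csv", index=False)
print(fact)
```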
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014