775 results for Data Mining, Rough Sets, Multi-Dimension, Association Rules, Constraint
Abstract:
In recent decades, business intelligence (BI) has gained momentum in real-world practice. At the same time, it has evolved into an important research subject of Information Systems (IS) within the decision support domain. Today's growing competitive pressure in business has increased the need for real-time analytics, i.e., so-called real-time BI or operational BI. This is especially true of the electricity production, transmission, distribution, and retail business: the laws of physics make electricity, as a commodity, nearly impossible to store economically, so demand and supply must be kept constantly in balance. The current power sector is subject to complex changes, innovation opportunities, and technical and regulatory constraints. These range from the low-carbon transition, renewable energy sources (RES) development, and market design to new technologies (e.g., smart metering, smart grids, electric vehicles) and new independent power producers (e.g., commercial buildings or households with rooftop solar panel installations, a.k.a. Distributed Generation). Among them, the ongoing deployment of Advanced Metering Infrastructure (AMI) has a profound impact on the electricity retail market. From the viewpoint of BI research, AMI enables real-time or near real-time analytics in the electricity retail business. Following the Design Science Research (DSR) paradigm in the IS field, this research presents four aspects of BI for efficient pricing in a competitive electricity retail market: (i) visual data-mining based descriptive analytics, namely electricity consumption profiling, for pricing decision-making support; (ii) a real-time BI enterprise architecture for enhancing management's capacity for real-time decision-making; (iii) prescriptive analytics through agent-based modeling for price-responsive demand simulation; (iv) a visual data-mining application for electricity distribution benchmarking.
Even though this study is from the perspective of the European electricity industry, particularly focused on Finland and Estonia, the BI approaches investigated can: (i) provide managerial implications to support the utility’s pricing decision-making; (ii) add empirical knowledge to the landscape of BI research; (iii) be transferred to a wide body of practice in the power sector and BI research community.
Abstract:
Our aim was to observe the induction of panic attacks by a hyperventilation challenge test in panic disorder patients (DSM-IV) and their healthy first-degree relatives. We randomly selected 25 panic disorder patients, 31 healthy first-degree relatives of probands with panic disorder and 26 normal volunteers with no family history of panic disorder. None of the patients had taken psychotropic drugs for at least one week. They were induced to hyperventilate (30 breaths/min) for 4 min, and anxiety scales were applied before and after the test. A total of 44.0% (N = 11) of panic disorder patients, 16.1% (N = 5) of first-degree relatives and 11.5% (N = 3) of control subjects had a panic attack after hyperventilating (chi² = 8.93, d.f. = 2, P = 0.011). In this challenge test the panic disorder patients were more sensitive to hyperventilation than first-degree relatives and normal volunteers. Although the hyperventilation test has a low sensitivity, our data suggest that there is no association between a family history of panic disorder and hyperreactivity to an acute hyperventilation challenge test. Perhaps cognitive variables should be considered to play a specific role in this association, since symptoms of a panic attack and acute hyperventilation overlap.
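The reported test statistic can be checked from the counts given in the abstract. The following is a minimal sketch, assuming the authors used an uncorrected Pearson chi-square test on the 3 x 2 table of group by outcome (the abstract does not name the exact procedure); the function and variable names are illustrative only.

```python
import math

# Counts reported in the abstract: (panic attack, no panic attack) per group.
observed = {
    "panic disorder patients": (11, 25 - 11),
    "first-degree relatives": (5, 31 - 5),
    "control subjects": (3, 26 - 3),
}

def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count under independence of group and outcome.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof

chi2, dof = pearson_chi_square(observed)
# For exactly 2 degrees of freedom the chi-square survival function
# reduces to exp(-x / 2), so no statistics library is needed here.
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, d.f. = {dof}, P = {p_value:.3f}")
```

Under these assumptions the sketch reproduces the figures quoted in the abstract (chi² = 8.93, d.f. = 2, P = 0.011).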
Abstract:
The relationship between anxiety-related behaviors and voluntary ethanol intake was examined in two pairs of rat lines by the oral ethanol self-administration procedure. Floripa high (H) and low (L) rats selectively bred for contrasting anxiety responses in the open-field test, and two inbred strains, spontaneously hypertensive rats (SHR) and Lewis rats which are known to differ significantly when submitted to several behavioral tests of anxiety/emotionality, were used (9-10 animals/line/sex). No differences in the choice of ethanol solutions (2%, days 1-4, and 4%, days 5-8, respectively) in a 2-bottle paradigm were detected between Floripa H and L rats (1.94 ± 0.37 vs 1.61 ± 0.37 g/kg for ethanol intake on day 8 by the Floripa H and L rat lines, respectively). Contrary to expectations, the less anxious SHR rats consumed significantly more ethanol than Lewis rats (respective intake of 2.30 ± 0.45 and 0.72 ± 0.33 g/kg on day 8) which are known to be both addiction-prone and highly anxious. Regardless of strain, female rats consumed more ethanol than males (approximately 46%). The results showed no relationship between high anxiety and voluntary intake of ethanol for Floripa H and L rats. A negative association between these two variables, however, was found for SHR and Lewis rat strains. Data from the literature regarding the association between anxiety and alcohol intake in animal models are not conclusive, but the present results indicate that factors other than increased inborn anxiety probably lead to the individual differences in ethanol drinking behavior.
Abstract:
CYP1A1 and GSTP1 polymorphisms have been associated with a higher risk of developing several cancers, including oral squamous cell carcinoma (OSCC), which is closely related to tobacco and alcohol consumption. Both genes code for enzymes that have an important role in activating or detoxifying carcinogenic elements found in tobacco and other compounds, and polymorphic variants of these genes may alter the enzymatic activity. The CYP1A1 gene codes for the enzyme aryl hydrocarbon hydroxylase, which is responsible for the metabolism of polycyclic aromatic hydrocarbons. The investigated polymorphism, Ile/Val, seems to increase the activity of the enzyme in homozygous individuals, leading to an accumulation of carcinogens. The Ile/Val polymorphism results from an A→G transition at exon 7, giving the CYP1A1*2B allele. The GSTP1*B variant shows an A→G transition at exon 5, changing the amino acid Ile to Val and reducing the catalytic activity of the enzyme. Because of this reduction, carriers of the mutant alleles lose the capability to metabolize carcinogens, which could be responsible for a higher susceptibility to cancer. We conducted a case-control study of 72 cases with newly diagnosed OSCC and 60 healthy controls matched for age, gender, smoking habits, and ethnicity. We used PCR methods to identify the allelic variants CYP1A1*2B and GSTP1*B. The data showed no statistically significant association of allelic or genotypic variants of CYP1A1*2B (OR = 1.06; 95% CI = 0.49-2.29) or GSTP1*B (OR = 1.40; 95% CI = 0.70-2.79) with OSCC.
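To illustrate why these intervals are read as non-significant, here is a minimal sketch of an odds ratio with a 95% confidence interval. The abstract does not say how the intervals were computed; the log-based (Woolf) method used here is an assumption, and the 2 x 2 counts in the final call are hypothetical placeholders, not data from the study. Only the two reported intervals come from the abstract.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2 x 2 table.

    a: cases carrying the variant      b: cases without the variant
    c: controls carrying the variant   d: controls without the variant
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

def includes_null(lower, upper):
    """An interval containing OR = 1 is compatible with no association."""
    return lower <= 1.0 <= upper

# 95% confidence intervals as reported in the abstract.
reported = {"CYP1A1*2B": (0.49, 2.29), "GSTP1*B": (0.70, 2.79)}
for allele, (lower, upper) in reported.items():
    print(allele, "interval includes 1:", includes_null(lower, upper))

# Hypothetical counts, for illustration only (NOT from the study).
print(odds_ratio_ci(30, 70, 20, 80))
```

Both reported intervals contain 1, which is consistent with the abstract's conclusion that neither variant showed a statistically significant association with OSCC.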
Abstract:
Companies require information in order to gain an improved understanding of their customers. Data concerning customers, their interests and behavior are collected through different loyalty programs. The amount of data stored in company databases has increased exponentially over the years and has become difficult to handle. This research area is the subject of much current interest, not only in academia but also in practice, as shown by the many magazines and blogs covering topics such as how to get to know your customers, Big Data, information visualization, and data warehousing. In this Ph.D. thesis, the Self-Organizing Map (SOM) and two extensions of it, the Weighted Self-Organizing Map (WSOM) and the Self-Organizing Time Map (SOTM), are used as data mining methods for extracting information from large amounts of customer data. The thesis focuses on how data mining methods can be used to model and analyze customer data in order to gain an overview of the customer base, as well as to analyze niche markets. The thesis uses real-world customer data to create models for customer profiling. The models are evaluated by CRM experts from the retailing industry. The experts considered the information gained with the help of the models to be valuable and useful for decision making and for strategic planning.
Abstract:
Presentation by Kristiina Hormia-Poutanen at the 25th Anniversary Conference of The National Repository Library of Finland, Kuopio, 22 May 2015.
Abstract:
The functional effect of the A>G transition at position 2756 on the MTR gene (5-methyltetrahydrofolate-homocysteine methyltransferase), involved in folate metabolism, may be a risk factor for head and neck squamous cell carcinoma (HNSCC). The frequency of MTR A2756G (rs1805087) polymorphism was compared between HNSCC patients and individuals without history of neoplasias. The association of this polymorphism with clinical histopathological parameters was evaluated. A total of 705 individuals were included in the study. The polymerase chain reaction-restriction fragment length polymorphism technique was used to genotype the polymorphism. For statistical analysis, the chi-square test (univariate analysis) was used for comparisons between groups and multiple logistic regression (multivariate analysis) was used for interactions between the polymorphism and risk factors and clinical histopathological parameters. Using univariate analysis, the results did not show significant differences in allelic or genotypic distributions. Multivariable analysis showed that tobacco and alcohol consumption (P < 0.05), AG genotype (P = 0.019) and G allele (P = 0.028) may be predictors of the disease and a higher frequency of the G polymorphic allele was detected in men with HNSCC compared to male controls (P = 0.008). The analysis of polymorphism regarding clinical histopathological parameters did not show any association with the primary site, aggressiveness, lymph node involvement or extension of the tumor. In conclusion, our data provide evidence that supports an association between the polymorphism and the risk of HNSCC.
Abstract:
Processing and refining materials. Presentation for Liikearkistopäivät 2015 (Business Archives Days 2015).
Abstract:
By leveraging cloud services, companies and organizations can significantly improve their efficiency and build novel business opportunities. Cloud computing offers companies various advantages, but it also carries risks. The advantages offered by service providers mostly concern efficiency and reliability, while the risks mostly concern security. Cloud security still demands significant attention. Security problems in the cloud, as in any area of computing, cannot be fully eliminated. However, service providers can use novel solutions to mitigate the potential threats to a large extent. Viewed at a very high level, the security problem has two focus directions: on one side are problems that threaten the service user's security and privacy, and on the other are problems that threaten the service provider's security and privacy. Both kinds of threats should mostly be detected and mitigated by service providers. On closer inspection, mitigating security problems that target providers can protect both the service provider and the user. However, the research community mostly focuses on solutions that protect cloud users. Significant research effort has gone into protecting cloud tenants against external attacks. However, attacks that originate from elastic, on-demand and legitimate cloud resources should still be taken seriously. The cloud-based botnet, or botcloud, is one of the prevalent cases of cloud resource misuse. Unfortunately, some of the cloud's essential characteristics enable criminals to form reliable, low-cost botclouds in a short time. In this paper, we present a system that helps to detect distributed infected Virtual Machines (VMs) acting as elements of botclouds.
Our system groups VMs based on a set of botnet-related system-level symptoms. Grouping VMs helps to separate infected VMs from others and narrows down the target group under inspection. Our system takes advantage of Virtual Machine Introspection (VMI) and data mining techniques.
Abstract:
A company seeking competitive advantage must be able to refine information and use it to identify new future opportunities. To form images of the future, a company must know its operating environment and be sensitive to change trends and other signals from that environment. The vital signals relate to competitors, technological development, changes in values, global demographic trends, or even environmental changes. Spatial relationships are among the basic pillars of how we conceptualize our world. Pitney (2015) has estimated that 80% of all business data contains some reference to location information. Nevertheless, location data is still poorly utilized in support of companies' strategic decisions. The development of technologies, fast data transfer, and the integration of positioning techniques into various devices mean that services and solutions exploiting location data will increasingly be seen in the business field. The aim of the study was to determine whether location intelligence can support strategic decision-making and, if so, how. The work was carried out using the constructive research method, which seeks to solve a relevant problem. The constructive research was conducted in close cooperation with three SMEs, and six people responsible for strategy were interviewed for it. The study found that location intelligence can be used to support strategic decision-making at several levels. In the simplest map solution, the desired data is brought onto a map and a visual presentation is created that makes it easier to draw conclusions. A second-level map solution contains both location and attribute data combined from different sources. This second-level map solution is often descriptive analytics, which enables the analysis of various phenomena.
A third-level, i.e., top-level, map solution offers predictive analytics and models of the future. In this case, intelligence is coded into the program, with the interrelationships of the information defined using either data mining or statistical analyses. The study concludes that location intelligence can add value in support of strategic decision-making if it is useful for the company to understand geographical differences in various phenomena, customer needs, competitors, and market changes. At its best, a location intelligence solution provides a reliable analysis in which information passes unchanged from one decision-maker to another and the reasons that led to a conclusion can be revisited when needed.
Abstract:
The strongest wish of the customer concerning chemical pulp features is consistent, uniform quality. Variation may be controlled and reduced by using statistical methods. However, studies addressing the application and benefits of statistical methods in the forest products sector are scarce. This customer wish is therefore the root motivation behind this dissertation. The research problem addressed is that companies in the chemical forest products sector require new knowledge to improve their utilization of statistical methods. To gain this new knowledge, the research problem is studied from five complementary viewpoints: challenges and success factors, organizational learning, problem solving, economic benefit, and statistical methods as management tools. The five research questions generated on the basis of these viewpoints are answered in four research papers, which are case studies based on empirical data collection. This research as a whole complements the literature dealing with the use of statistical methods in the forest products industry. Practical examples of the application of statistical process control, case-based reasoning, the cross-industry standard process for data mining, and performance measurement methods in the context of chemical forest products manufacturing are made available to the scientific community. The benefit of applying these methods is estimated or demonstrated. The purpose of this dissertation is to find pragmatic ideas that help companies in the chemical forest products sector improve their utilization of statistical methods. The main practical implications of this doctoral dissertation can be summarized in four points:
1. It is beneficial to reduce variation in chemical forest product manufacturing processes.
2. Statistical tools can be used to reduce this variation.
3. Problem-solving in chemical forest product manufacturing processes can be intensified through the use of statistical methods.
4. There are certain success factors and challenges that need to be addressed when implementing statistical methods.
Abstract:
This work examines visitor tracking methods and implements them in practice. The operation of web analytics software is explored, focusing mainly on Google Analytics. The goal is to determine the usage volumes of Lappeenranta's tourist information terminals and to break them down by device. A literature review of web analytics is carried out, and visitor tracking data from two different websites is analyzed and compared. In addition, the logs of the terminals' website are examined by means of data mining, using a Python application developed for the purpose. Based on the work, the usage volumes of the terminals cannot be broken down by device with the current implementation. The number of sessions and events can, however, be tracked. Several problems are identified in visitor tracking for the terminals, such as the distorting effect of the terminals' automatic page refresh on the results, the partial Google Analytics integration and, most importantly, the lack of an identifier unique to each terminal. The work proposes solutions that enable effective use of visitor tracking and device-specific tracking. The results emphasize the importance of systematic planning when implementing visitor tracking.
Abstract:
This research derived data from two sets of interviews with 18 participants who were involved in adult education in either a community college or a university. The purpose was to explore their worldview awareness. Through the interviews, the participants shared their understanding of worldview as a term and concept and as something that might be seen to apply in their practice of teaching. The responses indicated that there are three kinds of awareness (noetic, experiential, and integrative), which appeared to develop upon a landscape of constraints and opportunities. Constraints were seen to fall into five broad categories: institutional, circumstantial, self-imposed, other-imposed, and discipline-related. Opportunities for developing awareness were linked to individual experiences and could occur to different extents in many directions, on different occasions, and in different phases of life. Through this research, and in spite of the prevalence of worldview in the human experience, it was found that the term and concept have remained on the margins of educational discourse. Consequently, theory, research, and practice have been deprived of a useful and usable concept.
Abstract:
One of the most important problems in the theory of cellular automata (CA) is determining the proportion of cells in a specific state after a given number of time iterations. We approach this problem using patterns in preimage sets, that is, the sets of blocks which iterate to the desired output. This allows us to construct a response curve: the proportion of cells in state 1 after n iterations as a function of the initial proportion. We derive response curve formulae for many two-dimensional deterministic CA rules with L-neighbourhood. For all remaining rules, we find experimental response curves. We also use preimage sets to classify surjective rules. In the last part of the thesis, we consider a special class of one-dimensional probabilistic CA rules. We find response surface formulae for these rules and experimental response surfaces for all remaining rules.
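An experimental response curve of the kind described above can be estimated by simulation. The following is a minimal sketch under stated assumptions, not the thesis's method: the L-neighbourhood is taken to be the cell itself, its top neighbour and its right neighbour; the lattice is periodic; and the `majority` rule is a hypothetical example chosen only for illustration.

```python
import random

def step(grid, rule):
    """Apply one synchronous update of a binary 2D CA on a periodic lattice.

    Assumed L-neighbourhood: the cell, its top neighbour, its right neighbour.
    """
    n, m = len(grid), len(grid[0])
    return [
        [rule(grid[i][j], grid[(i - 1) % n][j], grid[i][(j + 1) % m])
         for j in range(m)]
        for i in range(n)
    ]

def majority(center, top, right):
    """Hypothetical example rule: a cell becomes 1 if at least two inputs are 1."""
    return 1 if center + top + right >= 2 else 0

def response_point(p, rule, size=32, iterations=3, seed=0):
    """Density of 1s after `iterations` steps from a random start of density p."""
    rng = random.Random(seed)
    grid = [[1 if rng.random() < p else 0 for _ in range(size)]
            for _ in range(size)]
    for _ in range(iterations):
        grid = step(grid, rule)
    return sum(map(sum, grid)) / (size * size)

# Sample the experimental response curve at initial densities 0.0, 0.1, ..., 1.0.
curve = [(k / 10, response_point(k / 10, majority)) for k in range(11)]
for p, density in curve:
    print(f"initial {p:.1f} -> after 3 steps {density:.3f}")
```

A single random lattice gives a noisy estimate; averaging over several seeds and a larger `size` would smooth the curve.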
Abstract:
Mobile augmented reality applications are increasingly utilized as a medium for enhancing learning and engagement in history education. Although these digital devices facilitate learning through immersive and appealing experiences, their design should be driven by theories of learning and instruction. We provide an overview of an evidence-based approach to optimizing the development of mobile augmented reality applications that teach students about history. Our research aims to evaluate and model the impact of design parameters on learning and engagement. The research program is interdisciplinary in that we apply techniques derived from design-based experiments and educational data mining. We outline the methodological and analytical techniques and discuss the implications of the anticipated findings.