74 results for Data Standards

at Universidade do Minho


Relevance:

30.00%

Publisher:

Abstract:

In order to create safer schools, the Chilean authorities published a Standard regarding school furniture dimensions. The aims of this study are twofold: to verify the existence of a positive secular trend within the Chilean student population, and to evaluate the potential mismatch between the anthropometric characteristics of that population and the school furniture dimensions defined by the mentioned Standard. The sample consists of 3078 subjects. Eight anthropometric measures were gathered, together with six furniture dimensions from the mentioned Standard. There has been an average increase in some dimensions within the Chilean student population over the past two decades. Accordingly, almost 18% of the students will find the seat height too high, and the seat depth will be considered too shallow by 42.8% of the students. It can be concluded that the Chilean student population has increased in stature, which supports the need to revise and update the data behind the mentioned Standard. Practitioner Summary: The positive secular trend results in high levels of mismatch if furniture is selected according to the current Chilean Standard, which uses data collected more than 20 years ago. This study shows that school furniture standards need to be updated over time.
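For illustration, the mismatch test can be made concrete with a short sketch. The thresholds below (seat height against popliteal height, seat depth against buttock-popliteal length) follow common criteria from the ergonomics literature and are assumptions, not the exact equations of this study; the data are synthetic.

```python
# Hedged sketch: share of students mismatched against fixed seat dimensions.
# Thresholds follow common mismatch criteria in the ergonomics literature;
# they are assumptions, not this study's exact rules. Data are synthetic.
import numpy as np

def mismatch_rates(popliteal_height, buttock_popliteal, seat_height, seat_depth):
    """Return the share of students for whom the seat is too high / too shallow."""
    popliteal_height = np.asarray(popliteal_height, dtype=float)
    buttock_popliteal = np.asarray(buttock_popliteal, dtype=float)
    # Seat is "too high" if it exceeds roughly the popliteal height
    # (common criterion: seat height <= ~95% of popliteal height).
    too_high = np.mean(seat_height > 0.95 * popliteal_height)
    # Seat is "too shallow" if its depth is under ~80% of buttock-popliteal length.
    too_shallow = np.mean(seat_depth < 0.80 * buttock_popliteal)
    return too_high, too_shallow

# Illustrative anthropometric data only (cm), not the study's sample.
rng = np.random.default_rng(0)
ph = rng.normal(42.0, 3.0, 3078)   # popliteal heights
bp = rng.normal(45.0, 3.5, 3078)   # buttock-popliteal lengths
high, shallow = mismatch_rates(ph, bp, seat_height=43.0, seat_depth=35.0)
print(f"too high: {high:.1%}, too shallow: {shallow:.1%}")
```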

Relevance:

30.00%

Publisher:

Abstract:

Transcriptional Regulatory Networks (TRNs) are a powerful tool for representing several of the interactions that occur within a cell. Recent studies have provided information to help researchers in the tasks of building and understanding these networks. One of the major sources of information for building TRNs is the biomedical literature. However, due to the rapidly increasing number of scientific papers, it is quite difficult to analyse the large number of papers that have been published on this subject. This fact has heightened the importance of Biomedical Text Mining approaches in this task. Also, owing to the lack of adequate standards, as the number of databases increases, inconsistencies concerning gene and protein names and identifiers are common. In this work, we developed an integrated approach for the reconstruction of TRNs that retrieves the relevant information from important biological databases and inserts it into a unique repository, named KREN. We then applied text mining techniques over this integrated repository to build TRNs. To this end, it was necessary to create a dictionary of names and synonyms associated with these entities, and also to develop an approach that retrieves all the abstracts of the related scientific papers stored in PubMed, in order to create a corpus of data about genes. Furthermore, these tasks were integrated into @Note, a software system that provides several methods from the Biomedical Text Mining field, including algorithms for Named Entity Recognition (NER), the extraction of all relevant terms from publication abstracts, and the extraction of relationships between biological entities (genes, proteins and transcription factors). Finally, we extended this tool to allow the reconstruction of Transcriptional Regulatory Networks from the scientific literature.
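A minimal sketch of the dictionary-based Named Entity Recognition step described above; the dictionary entries and abstract text are illustrative, and this is not the actual @Note API.

```python
# Hedged sketch of dictionary-based NER over abstracts, in the spirit of the
# approach described above. Dictionary content and text are illustrative;
# this is not the @Note API.
import re

# Synonym dictionary: each canonical gene identifier maps to its known names.
gene_dictionary = {
    "lexA": {"lexA", "exrA"},
    "recA": {"recA", "tif-1"},
}

def recognize_genes(abstract: str, dictionary: dict) -> dict:
    """Return {canonical_id: [matched synonyms]} found in the abstract."""
    hits = {}
    for canonical, synonyms in dictionary.items():
        for name in synonyms:
            # Word-boundary match, case-sensitive (gene symbols are case-sensitive).
            if re.search(rf"\b{re.escape(name)}\b", abstract):
                hits.setdefault(canonical, []).append(name)
    return hits

abstract = "Expression of recA is repressed by the lexA product..."
print(recognize_genes(abstract, gene_dictionary))
```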

Relevance:

20.00%

Publisher:

Abstract:

"The idea that social processes develop in a cyclical manner is somewhat like a 'Lorelei'. Researchers are lured to it because of its theoretical promise, only to become entangled in (if not wrecked by) messy problems of empirical inference. The reasoning leading to hypotheses of some kind of cycle is often elegant enough, yet the data from repeated observations rarely display the supposed cyclical pattern. (...) In addition, various 'schools' seem to exist which frequently arrive at different conclusions on the basis of the same data." (van der Eijk and Weber 1987:271). Much of the empirical controversy around these issues arises from three distinct problems: the coexistence of cycles of different periodicities, the possibility of transient cycles, and the existence of cycles without fixed periodicity. In some cases, there is no reason to expect any of these phenomena to be relevant; seasonality caused by Christmas is one such example (Wen 2002). In such cases, researchers mostly rely on spectral analysis and Auto-Regressive Moving-Average (ARMA) models to estimate the periodicity of cycles. However, and this is particularly true in the social sciences, sometimes there are good theoretical reasons to expect irregular cycles. In such cases, "the identification of periodic movement in something like the vote is a daunting task all by itself. When a pendulum swings with an irregular beat (frequency), and the extent of the swing (amplitude) is not constant, mathematical functions like sine-waves are of no use." (Lebo and Norpoth 2007:73) In the past, this difficulty has led to two different approaches. On the one hand, some researchers dismissed these methods altogether, relying on informal alternatives that do not meet rigorous standards of statistical inference; Goldstein (1985 and 1988), studying the severity of Great Power wars, is one such example. On the other hand, there are authors who transfer the assumptions of spectral analysis (and ARMA models) into fundamental assumptions about the nature of social phenomena. This type of argument was produced by Beck (1991) who, in a reply to Goldstein (1988), claimed that only "fixed period models are meaningful models of cyclic phenomena". We argue that wavelet analysis, a mathematical framework developed in the mid-1980s (Grossman and Morlet 1984; Goupillaud et al. 1984), is a very viable alternative for studying cycles in political time-series. It has the advantage of staying close to the frequency-domain approach of spectral analysis while addressing its main limitations. Its principal contribution comes from estimating the spectral characteristics of a time-series as a function of time, thus revealing how its different periodic components may change over time. The rest of the article proceeds as follows. In the section "Time-frequency Analysis", we study in some detail the continuous wavelet transform and compare its time-frequency properties with the more standard tool for that purpose, the windowed Fourier transform. In the section "The British Political Pendulum", we apply wavelet analysis to essentially the same data analyzed by Lebo and Norpoth (2007) and Merrill, Grofman and Brunell (2011) and try to provide a more nuanced answer to the same question discussed by these authors: do British electoral politics exhibit cycles? Finally, in the last section, we present a concise list of future directions.
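To make the method concrete, a from-scratch sketch of the continuous wavelet transform with a Morlet mother wavelet follows; it illustrates the general technique, not the authors' implementation, and the drifting-period signal is synthetic.

```python
# Hedged sketch of a continuous wavelet transform (CWT) with a Morlet mother
# wavelet, the tool advocated above. A from-scratch illustration only.
import numpy as np

def morlet(t, omega0=6.0):
    """(Approximate) analytic Morlet wavelet."""
    return np.pi**-0.25 * np.exp(1j * omega0 * t) * np.exp(-0.5 * t**2)

def cwt(signal, scales, dt=1.0, omega0=6.0):
    """Return |W(scale, time)|: wavelet power as a function of time and scale."""
    n = len(signal)
    out = np.empty((len(scales), n), dtype=complex)
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + dt, dt) / s
        psi = morlet(t, omega0) / np.sqrt(s)          # daughter wavelet at scale s
        # Cross-correlate the signal with the (conjugated) daughter wavelet.
        out[i] = np.convolve(signal, np.conj(psi)[::-1], mode="same") * dt
    return np.abs(out)

# A cycle whose period drifts from ~20 to ~40 samples: invisible to a single
# global spectrum, but visible in the time-frequency plane.
t = np.arange(1000)
x = np.sin(2 * np.pi * t / (20 + 20 * t / 1000))
power = cwt(x, scales=np.arange(2, 64))
print(power.shape)  # (scales, time): periodicity estimated as a function of time
```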

Relevance:

20.00%

Publisher:

Abstract:

As huge amounts of data become available in organizations and society, specific data analytics skills and techniques are needed to explore these data and extract from them useful patterns, tendencies, models or other knowledge that can be used to support the decision-making process, to define new strategies, or to understand what is happening in a specific field. Only with a deep understanding of a phenomenon is it possible to fight it. In this paper, a data-driven analytics approach is used to analyse the increasing incidence of fatalities by pneumonia in the Portuguese population, characterizing the disease and its incidence in terms of fatalities. This knowledge can be used to define appropriate strategies aiming to reduce the phenomenon, which has grown by more than 65% in a decade.
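The headline figure amounts to a simple relative-growth computation, sketched below with illustrative counts rather than the study's data.

```python
# Hedged sketch: decade-over-decade growth in pneumonia fatalities from
# yearly counts. The numbers are illustrative, not the study's data.
import pandas as pd

deaths = pd.Series({2003: 4200, 2008: 5300, 2013: 7000})  # illustrative counts
growth = deaths[2013] / deaths[2003] - 1
print(f"growth 2003-2013: {growth:.0%}")  # e.g. 67%, "more than 65% in a decade"
```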

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a methodology based on Bayesian data fusion techniques applied to non-destructive and destructive tests for the structural assessment of historical constructions. The aim of the methodology is to reduce the uncertainty of parameter estimation; the Young's modulus of granite stones was chosen as the example for the present paper. The methodology considers several levels of uncertainty, since the parameters of interest are treated as random variables with random moments. A new concept, the Trust Factor, was introduced to weight the uncertainty associated with each test's results, expressed through their standard deviation, according to the higher or lower reliability of each test in predicting a given parameter.
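One common way to realize such a fusion is precision weighting of independent Gaussian estimates, with the Trust Factor inflating the standard deviation of less reliable tests. The sketch below is an interpretation under that assumption, with illustrative numbers; it is not the paper's exact formulation.

```python
# Hedged sketch: fuse several test estimates of a Young's modulus via
# precision weighting, with a Trust Factor that inflates the standard
# deviation of less reliable tests. Illustrative numbers only.
import numpy as np

def fuse(means, stds, trust):
    """Combine independent Gaussian estimates; trust in (0, 1], 1 = full trust."""
    means, stds, trust = map(np.asarray, (means, stds, trust))
    eff_std = stds / np.sqrt(trust)       # lower trust -> larger effective sigma
    w = 1.0 / eff_std**2                  # precision weights
    mean = np.sum(w * means) / np.sum(w)
    std = np.sqrt(1.0 / np.sum(w))
    return mean, std

# E.g. a sonic (non-destructive) test and a compression (destructive) test, GPa:
m, s = fuse(means=[28.0, 24.0], stds=[6.0, 2.0], trust=[0.5, 1.0])
print(f"fused E = {m:.1f} +/- {s:.1f} GPa")
```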

Relevance:

20.00%

Publisher:

Abstract:

Hospitals nowadays collect vast amounts of data related to patient records. All these data hold valuable knowledge that can be used to improve hospital decision making. Data mining techniques aim precisely at the extraction of useful knowledge from raw data. This work describes the implementation of a medical data mining project based on the CRISP-DM methodology. Recent real-world data, from 2000 to 2013, related to inpatient hospitalization, were collected from a Portuguese hospital. The goal was to predict generic hospital Length Of Stay based on indicators that are commonly available at the hospitalization process (e.g., gender, age, episode type, medical specialty). At the data preparation stage, the data were cleaned and the variables were selected and transformed, leading to 14 inputs. Next, at the modeling stage, a regression approach was adopted in which six learning methods were compared: Average Prediction, Multiple Regression, Decision Tree, Artificial Neural Network ensemble, Support Vector Machine and Random Forest. The best learning model was obtained by the Random Forest method, which presents a high coefficient of determination (0.81). This model was then opened up using a sensitivity analysis procedure, which revealed three influential input attributes: the hospital episode type, the physical service where the patient is hospitalized, and the associated medical specialty. Such extracted knowledge confirms that the obtained predictive model is credible and has potential value for supporting the decisions of hospital managers.
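A minimal sketch of the modelling stage under the stated setup (Random Forest regression plus an input-importance analysis). Synthetic data stand in for the hospital records, which are not public, and permutation importance is used here as a generic stand-in for the paper's sensitivity analysis procedure.

```python
# Hedged sketch: Random Forest regressor for length of stay, plus a
# permutation-based importance analysis. Synthetic data; a stand-in for the
# paper's sensitivity analysis, not its exact procedure.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.integers(0, 2, n),      # gender
    rng.integers(18, 95, n),    # age
    rng.integers(0, 3, n),      # episode type
    rng.integers(0, 10, n),     # medical specialty (coded)
])
y = 2 + 0.1 * X[:, 1] + 3 * X[:, 2] + rng.normal(0, 2, n)  # synthetic LOS (days)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("R^2:", round(model.score(X_te, y_te), 2))

imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
for name, val in zip(["gender", "age", "episode type", "specialty"],
                     imp.importances_mean):
    print(f"{name}: {val:.3f}")
```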

Relevance:

20.00%

Publisher:

Abstract:

It is difficult to avoid the "smart systems" topic when discussing smart prevention and, similarly, it is difficult to address smart systems without focusing on their ability to learn. Following the same line of thought, in the current reality it seems a Herculean task (or an irreparable omission) to approach the topic of certified occupational health and safety management systems (OHSMS) without discussing integrated management systems (IMSs). The available data suggest that OHSMS seldom operate as the single management system (MS) in a company, so any statement concerning OHSMS should mainly be interpreted from an integrated perspective. A major distinction between generic systems can be drawn between those that learn, i.e., systems that have "memory", and those that do not. The former are often depicted as adaptive, since they take past events into account to deal with novel, similar and future events, modifying their structure to enable success in their environment. Often, these systems present nonlinear behavior and huge uncertainty in the forecasting of some events. This paper seeks to portray, for the first time as far as we were able to determine, IMSs as complex adaptive systems (CASs) by listing their properties and dissecting the features that enable them to evolve and self-organize in order to, holistically, fulfil the requirements of different stakeholders and thus thrive by assuring the successful sustainability of a company. Based on the literature review carried out, this is the first time that IMSs have been characterized as CASs, which may develop fruitful synergies for both the MS and the CAS communities. By performing a thorough review of the literature, and based on some concepts embedded in the "DNA" of the subsystems' implementation standards, we intend, specifically, to identify, determine and discuss the properties of a generic IMS that should be considered in order to classify it as a CAS.

Relevance:

20.00%

Publisher:

Abstract:

Earthworks tasks aim at levelling the ground surface of a target construction area and precede any kind of structural construction (e.g., road and railway construction). They comprise sequential tasks, such as excavation, transportation, spreading and compaction, and are strongly based on heavy mechanical equipment and repetitive processes. In this context, it is essential to optimize the usage of all available resources under two key criteria: the cost and the duration of earthwork projects. In this paper, we present an integrated system that uses two artificial intelligence techniques: data mining and evolutionary multi-objective optimization. The former is used to build data-driven models capable of providing realistic estimates of resource productivity, while the latter is used to optimize resource allocation considering the two main earthwork objectives (duration and cost). Experiments held using real-world data from a construction site have shown that the proposed system is competitive when compared with current manual earthwork design.
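The optimization objective can be illustrated with a simple Pareto-front filter over candidate allocations. This is a sketch of the concept only, not the paper's evolutionary algorithm, and the costs, durations and allocations below are invented.

```python
# Hedged sketch of the optimisation side: given cost/duration estimates for
# candidate equipment allocations (which the data-driven models would
# supply), keep only the Pareto-optimal ones. Illustrative data.
def pareto_front(solutions):
    """solutions: list of (cost, duration, allocation). Keep non-dominated ones."""
    front = []
    for c, d, alloc in solutions:
        # A solution is dominated if another is at least as good on both
        # objectives and strictly better on at least one.
        dominated = any(c2 <= c and d2 <= d and (c2 < c or d2 < d)
                        for c2, d2, _ in solutions)
        if not dominated:
            front.append((c, d, alloc))
    return front

candidates = [
    (100_000, 30, "2 excavators, 4 trucks"),
    (130_000, 22, "3 excavators, 6 trucks"),
    (140_000, 29, "3 excavators, 4 trucks"),  # dominated: costs more, takes longer
]
for sol in pareto_front(candidates):
    print(sol)
```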

Relevance:

20.00%

Publisher:

Abstract:

Doctoral Thesis in Educational Sciences (specialization in Educational Technology)

Relevance:

20.00%

Publisher:

Abstract:

We are living in the era of Big Data, a time characterized by the continuous creation of vast amounts of data originating from different sources and in different formats. First with the rise of social networks and, more recently, with the advent of the Internet of Things (IoT), in which everyone and (eventually) everything is linked to the Internet, data with enormous potential for organizations are being continuously generated. In order to be more competitive, organizations want to access and explore all the richness present in those data. Indeed, Big Data is only as valuable as the insights organizations gather from it to make better decisions, which is the main goal of Business Intelligence. In this paper we describe an experiment in which data obtained from a NoSQL data source (a database technology explicitly developed to deal with the specificities of Big Data) are used to feed a Business Intelligence solution.
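A minimal sketch of such a pipeline: documents read from a NoSQL store (MongoDB here, as an assumption; the paper does not name its engine in this abstract) and flattened into a table a BI layer can consume. Connection details and field names are illustrative.

```python
# Hedged sketch: feed a BI layer from a NoSQL source. Documents are read
# from MongoDB and flattened into a tabular structure that a data warehouse
# or BI tool can consume. Database, collection and fields are hypothetical.
import pandas as pd
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
orders = client["shop"]["orders"]  # hypothetical database/collection

# Flatten nested JSON documents into rows (one row per order).
df = pd.json_normalize(list(orders.find({}, {"_id": 0})))

# A BI-style aggregation: revenue per customer country.
summary = df.groupby("customer.country")["total"].sum()
print(summary)
```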

Relevance:

20.00%

Publisher:

Abstract:

Studies in Computational Intelligence, 616

Relevance:

20.00%

Publisher:

Abstract:

Over the last few years, many research efforts have been made to improve the design of ETL (Extract-Transform-Load) systems. ETL systems are considered very time-consuming, error-prone and complex, involving several participants from different knowledge domains. ETL processes are one of the most important components of a data warehousing system, and they are strongly influenced by the complexity of business requirements and by their change and evolution. These aspects influence not only the structure of the data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) for modelling more conceptual ETL workflows, mapping them to real execution primitives through the use of a domain-specific language that allows the generation of specific instances that can be executed in a commercial ETL tool.
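The pattern idea can be sketched as a tiny mapping from pattern names to execution primitives. The pattern and primitive names below are hypothetical, and the paper itself targets BPMN models and a commercial ETL tool rather than Python.

```python
# Hedged sketch of the pattern idea: a tiny domain-specific description of an
# ETL workflow expanded into concrete execution primitives. All names are
# illustrative; not the paper's DSL or its BPMN mapping.
PATTERNS = {
    "surrogate_key": ["lookup_key", "generate_key_if_missing", "replace_key"],
    "slowly_changing_dim": ["detect_change", "close_old_row", "insert_new_row"],
}

def expand(workflow):
    """Map a list of pattern names to the primitives an ETL engine would run."""
    steps = []
    for pattern in workflow:
        steps.extend(PATTERNS[pattern])
    return steps

print(expand(["surrogate_key", "slowly_changing_dim"]))
```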

Relevance:

20.00%

Publisher:

Abstract:

The computational resources required for processing large volumes of data when populating a data warehouse mean that the search for new implementations must also take into account the energy efficiency of the various processing components that make up any populating system. The lack of techniques or methodologies for categorizing and evaluating energy consumption in data warehouse populating systems is clearly evident. Access to this kind of information would make it possible to build data warehouse populating systems with lower levels of energy consumption and, therefore, more efficient ones. Starting from the adaptation of techniques applied to database management systems for obtaining the energy consumption of query execution, we designed and implemented a new technique that allows us to obtain the energy consumption of any data warehouse populating process, by evaluating the consumption of each of the components used in its implementation with a conventional tool. In this paper we present how we perform such an evaluation, demonstrating the viability of our proposal with a very typical data warehouse populating process (the chained substitution of operational keys), implemented with the Kettle tool.
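At its core, the measurement idea reduces to attributing energy to each component of the populating process as average power draw times execution time. The sketch below assumes power figures come from an external meter or hardware counters; the component and wattage are illustrative, and this is not the paper's tooling.

```python
# Hedged sketch: attribute energy to each component of a populating (ETL)
# process as average power draw times execution time (E = P * t). The power
# figure would come from an external meter or RAPL counters; values here are
# illustrative, and the step is a stand-in for a Kettle transformation step.
import time

def measure(component, avg_power_watts):
    """Run a component and return its estimated energy in joules."""
    start = time.perf_counter()
    component()
    elapsed = time.perf_counter() - start
    return avg_power_watts * elapsed

def key_substitution_step():    # stand-in for one populating-process component
    sum(i * i for i in range(1_000_000))

energy_j = measure(key_substitution_step, avg_power_watts=35.0)
print(f"estimated energy: {energy_j:.2f} J")
```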

Relevance:

20.00%

Publisher:

Abstract:

Worldwide, around 9% of children are born with less than 37 weeks of gestation, putting the premature child at risk, as it is not prepared to develop a number of basic functions that begin soon after birth. In order to ensure that these risk pregnancies are properly monitored by obstetricians, in time to avoid such problems, Data Mining (DM) models were induced in this study to predict preterm births in a real environment, using data from 3376 patients (women) admitted to the maternal and perinatal care unit of Centro Hospitalar do Porto. A sensitive metric to predict preterm deliveries was developed, assisting physicians in the decision-making process regarding the patients' observation. It was possible to obtain promising results, achieving sensitivity and specificity values of 96% and 98%, respectively.
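The reported figures follow directly from a confusion matrix; a short sketch, with counts chosen only to reproduce values close to those reported, not taken from the study.

```python
# Hedged sketch of the evaluation metric: sensitivity and specificity from a
# confusion matrix. Counts are illustrative, chosen only to land near the
# reported 96% / 98%.
def sensitivity_specificity(tp, fn, tn, fp):
    sensitivity = tp / (tp + fn)   # share of preterm births correctly flagged
    specificity = tn / (tn + fp)   # share of term births correctly cleared
    return sensitivity, specificity

sens, spec = sensitivity_specificity(tp=288, fn=12, tn=3000, fp=61)
print(f"sensitivity: {sens:.0%}, specificity: {spec:.0%}")
```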