922 results for Functional data analysis
Abstract:
A proportion of melanoma-prone individuals in both familial and non-familial contexts has been shown to carry inactivating mutations in either CDKN2A or, rarely, CDK4. CDKN2A is a complex locus that encodes two unrelated proteins from alternately spliced transcripts that are read in different frames. The alpha transcript (exons 1alpha, 2, and 3) produces the p16INK4A cyclin-dependent kinase inhibitor, while the beta transcript (exons 1beta and 2) is translated as p14ARF, a stabilizing factor of p53 levels through binding to MDM2. Mutations in exon 2 can impair both polypeptides, and insertions and deletions in exons 1alpha, 1beta, and 2 can theoretically generate p16INK4A-p14ARF fusion proteins. No online database currently takes into account all the consequences of these genotypes, a situation compounded by some problematic previous annotations of CDKN2A-related sequences and descriptions of their mutations. As an initiative of the international Melanoma Genetics Consortium, we have therefore established a database of germline variants observed in all loci implicated in familial melanoma susceptibility. Such a comprehensive, publicly accessible database is an essential foundation for research on melanoma susceptibility and its clinical application. Our database serves two types of data as defined by HUGO. The core dataset includes the nucleotide variants on the genomic and transcript levels, amino acid variants, and citation. The ancillary dataset includes keyword descriptions of events at the transcription and translation levels and epidemiological data. The application that handles users' queries was designed in the model-view-controller architecture and was implemented in Java. The object-relational database schema was deduced using functional dependency analysis. We hereby present our first functional prototype of eMelanoBase. The service is accessible via the URL www.wmi.usyd.edu.au:8080/melanoma.html.
Abstract:
An increasing number of studies shows that the glycogen-accumulating organisms (GAOs) can survive and may indeed proliferate under the alternating anaerobic/aerobic conditions found in EBPR systems, thus forming a strong competitor of the polyphosphate-accumulating organisms (PAOs). Understanding their behaviors in a mixed PAO and GAO culture under various operational conditions is essential for developing operating strategies that disadvantage the growth of this group of unwanted organisms. A model-based data analysis method is developed in this paper for the study of the anaerobic PAO and GAO activities in a mixed PAO and GAO culture. The method primarily makes use of the hydrogen ion production rate and the carbon dioxide transfer rate resulting from the acetate uptake processes by PAOs and GAOs, measured with a recently developed titration and off-gas analysis (TOGA) sensor. The method is demonstrated using the data from a laboratory-scale sequencing batch reactor (SBR) operated under alternating anaerobic and aerobic conditions. The data analysis using the proposed method strongly indicates a coexistence of PAOs and GAOs in the system, which was independently confirmed by fluorescent in situ hybridization (FISH) measurement. The model-based analysis also allowed the identification of the respective acetate uptake rates by PAOs and GAOs, along with a number of kinetic and stoichiometric parameters involved in the PAO and GAO models. The excellent fit between the model predictions and the experimental data not involved in parameter identification shows that the parameter values found are reliable and accurate. It also demonstrates that the current anaerobic PAO and GAO models are able to accurately characterize the PAO/GAO mixed culture obtained in this study. 
This is of major importance as no pure culture of either PAOs or GAOs has been reported to date, and hence the current PAO and GAO models were developed for the interpretation of experimental results of mixed cultures. The proposed method is readily applicable for detailed investigations of the competition between PAOs and GAOs in enriched cultures. However, the fermentation of organic substrates carried out by ordinary heterotrophs needs to be accounted for when the method is applied to the study of PAO and GAO competition in full-scale sludges. (C) 2003 Wiley Periodicals, Inc.
Abstract:
This study analyses productive efficiency and the effects of concentration on banking costs, focusing on the Portuguese banking industry. The multiproduct character of the banking firm suggests the need for multiproduct forms of the cost function (Fourier type). We introduce homogeneity and structure variables that allow single-product functional forms (Cobb-Douglas) to be applied to banking. The sample comprises 22 banks that operated in Portugal between 1995 and 2001, on a non-consolidated basis and with a panel data structure. Inefficiency is studied with the stochastic frontier model (SFA) under both specifications. To analyse concentration, we introduce binary variables intended to capture the effects over the four years following the concentration process. For both the SFA and the concentration analysis, the results are sensitive to the functional specification adopted. In conclusion, the concentration process in the banking industry appears to be justified by the possibility of reducing X-inefficiency.
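As a point of reference, a textbook Cobb-Douglas stochastic cost frontier for panel data can be written as below. This is only a generic sketch: the symbols and the distributional assumptions are illustrative and are not taken from the study, whose specification also includes homogeneity, structure and concentration variables.

```latex
\ln C_{it} = \beta_0 + \beta_y \ln y_{it} + \sum_{j} \beta_j \ln w_{jit} + v_{it} + u_{it},
\qquad v_{it} \sim N(0, \sigma_v^2), \quad u_{it} \ge 0
```

Here $C_{it}$ is the total cost of bank $i$ in year $t$, $y_{it}$ is output, $w_{jit}$ are input prices, $v_{it}$ is symmetric noise, and $u_{it}$ is the one-sided inefficiency term that places observed cost above the frontier.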
Abstract:
Objectives: The purpose of this article is to identify differences between surveys using paper and online questionnaires. The author draws on extensive experience with opinion questions in survey-based research, including the limits of postal and online questionnaires. Methods: Paper and online questionnaires were used in the physician studies carried out in 1995 (doctors graduated in 1982-1991), 2000 (doctors graduated in 1982-1996), 2005 (doctors graduated in 1982-2001) and 2011 (doctors graduated in 1977-2006), and in a study of 457 family doctors in 2000. The response rates were 64%, 68%, 64%, 49% and 73%, respectively. Results: The physician studies showed differences between the methods. These differences were related to the use of paper-based versus online questionnaires and to the response rate. The online survey gave a lower response rate than the postal survey. The major advantages of the online survey were a short response time, very low financial resource needs, and direct loading of the data into the data analysis software, which saved the time and resources associated with data entry. Conclusions: The article helps researchers plan the study design and choose the right data collection method.
Abstract:
This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. The Attribution-NonCommercial (CC BY-NC) license lets others remix, tweak, and build upon the work non-commercially; new works must also acknowledge the original and be non-commercial.
Abstract:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract:
Research on cluster analysis for categorical data continues to develop, with new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters are performed simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length (MML) criterion to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the approach of Figueiredo and Jain (2002). The novelty of the approach rests on the integration of model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number from a set of pre-estimated candidate models. The performance of our approach is compared with the use of the Bayesian Information Criterion (BIC) (Schwarz, 1978) and the Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The results illustrate the capacity of the proposed algorithm to recover the true number of clusters while running faster than BIC and ICL, which is especially relevant when dealing with large data sets.
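To make the comparison baseline concrete, the sketch below fits a finite mixture of independent categorical (multinomial) variables with plain EM for each candidate number of clusters and then picks the number with the lowest BIC. It illustrates the "pre-estimated candidate models" strategy that the abstract contrasts with, not the authors' integrated MML algorithm. The function names, the random initialisation and the small smoothing constants are assumptions made for this illustration.

```python
import math
import random


def n_free_params(k, n_levels):
    """Free parameters: (k - 1) mixing weights plus k sets of category probabilities."""
    return (k - 1) + k * sum(levels - 1 for levels in n_levels)


def fit_mixture(data, n_levels, k, n_iter=100, seed=0):
    """EM for a k-component mixture of independent categorical variables.

    data: list of rows, each a list of integer category codes (0-based).
    n_levels: number of categories of each variable.
    Returns (loglik, weights, theta), theta[c][j][v] = P(x_j = v | component c).
    The log-likelihood is the one evaluated in the last E-step.
    """
    rng = random.Random(seed)
    n, d = len(data), len(n_levels)
    weights = [1.0 / k] * k
    theta = []
    for _ in range(k):  # random start: perturbed category probabilities
        comp = []
        for j in range(d):
            raw = [rng.random() + 0.5 for _ in range(n_levels[j])]
            total = sum(raw)
            comp.append([r / total for r in raw])
        theta.append(comp)
    loglik = -math.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed with the log-sum-exp trick.
        resp, loglik = [], 0.0
        for row in data:
            logp = [math.log(weights[c])
                    + sum(math.log(theta[c][j][row[j]]) for j in range(d))
                    for c in range(k)]
            m = max(logp)
            tot = sum(math.exp(lp - m) for lp in logp)
            loglik += m + math.log(tot)
            resp.append([math.exp(lp - m) / tot for lp in logp])
        # M-step: weighted counts, lightly smoothed to avoid zero probabilities.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = (nc + 1e-9) / (n + k * 1e-9)
            for j in range(d):
                counts = [1e-3] * n_levels[j]
                for r, row in zip(resp, data):
                    counts[row[j]] += r[c]
                total = sum(counts)
                theta[c][j] = [x / total for x in counts]
    return loglik, weights, theta


def bic(loglik, k, n_levels, n_rows):
    """Bayesian Information Criterion (lower is better)."""
    return -2.0 * loglik + n_free_params(k, n_levels) * math.log(n_rows)


def select_k(data, n_levels, candidates):
    """Fit one model per candidate k and return the k with the lowest BIC."""
    scores = {}
    for k in candidates:
        loglik, _, _ = fit_mixture(data, n_levels, k)
        scores[k] = bic(loglik, k, n_levels, len(data))
    return min(scores, key=scores.get)
```

A call such as `select_k(rows, [3, 2, 4], range(1, 6))` fits five separate models before choosing one, which is the cost that an integrated estimate-and-select algorithm is meant to avoid.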
Abstract:
Catastrophic events, such as wars and terrorist attacks, tornadoes and hurricanes, earthquakes, tsunamis, floods and landslides, are always accompanied by a large number of casualties. The size distribution of these casualties has separately been shown to follow approximate power law (PL) distributions. In this paper, we analyze the statistical distributions of the number of victims of catastrophic phenomena, in particular terrorism, and find double PL behavior. This means that the data sets are better approximated by two PLs than by a single one. We plot the PL parameters corresponding to several events and observe an interesting pattern in the charts, where the lines that connect each pair of points defining the double PLs are almost parallel to each other. A complementary data analysis is performed by computing the entropy. The results reveal relationships hidden in the data that may trigger a future comprehensive explanation of this type of phenomenon.
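One simple way to look for double PL behavior is to rank the events by size, fit a straight line to each of two segments of the log-log rank-size plot, and keep the breakpoint with the smallest total squared error. The sketch below does that and also computes a Shannon entropy of a set of counts. The abstract does not state the fitting procedure or the entropy definition used in the paper, so the least-squares fit, the breakpoint search and all function names here are assumptions.

```python
import math


def loglog_fit(xs, ys):
    """Least-squares line through (log x, log y); returns (slope, intercept, sse)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(lx, ly))
    return slope, intercept, sse


def double_power_law(sizes, min_points=3):
    """Fit two power laws to a rank-size plot.

    sizes: positive event sizes (for example, casualties per event).
    Returns (breakpoint, slope_1, slope_2): the first `breakpoint` ranks belong
    to the first power law, the remaining ranks to the second.
    """
    ordered = sorted(sizes, reverse=True)
    ranks = list(range(1, len(ordered) + 1))
    best = None
    for b in range(min_points, len(ordered) - min_points + 1):
        first = loglog_fit(ranks[:b], ordered[:b])
        second = loglog_fit(ranks[b:], ordered[b:])
        total = first[2] + second[2]
        if best is None or total < best[0]:
            best = (total, b, first[0], second[0])
    return best[1], best[2], best[3]


def shannon_entropy(counts):
    """Shannon entropy (in nats) of the empirical distribution given by counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

Least squares on a log-log plot is easy to read but is known to be a rough estimator of power-law exponents; a maximum-likelihood fit would be a reasonable alternative for a more careful analysis.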
Abstract:
In recent decades we have witnessed economic, technological, political and social transformations that have directly influenced the way organizations think and act. The concept of competencies, increasingly valued, has emerged as an alternative to the job-based approach to human resource management, responding to current market challenges: the need for flexibility and adaptation to continuous change, growing market demands, and the competitiveness of organizations in that market. The health sector, and specifically the nursing profession, has also evolved, with a new way of structuring these professionals' careers emerging in 2009. For nurses with management roles, the job content is described, but there is no clear definition of the competencies required of these professionals. This exploratory research, using a qualitative methodology, sought to propose a strategy for defining a competency model for nurses with management roles in Portugal. To that end, we defined competency categories through an analysis of the literature and the legislation. We then interviewed a panel of twelve experts and carried out a content analysis of the data (mixed-type categorization). We compared the empirically collected competencies with those from the theoretical review and defined a list of 10 competencies for nurses' management roles: Technical Management Competencies; Interpersonal Competencies; Communication; Human Resource Management; Critical Thinking; Knowledge of Health Policies; Technical Nursing Competencies; Organization and Planning; Teamwork; Concern for Quality. 
To complement the study, we sought to identify perceived competency gaps among nurses with management roles and the competency development processes considered most relevant for these professionals. The gaps identified in the competencies of current nurses with management roles, relative to the most valued competencies, are small and scattered, so we consider them of little significance. The form of competency development most valued by the expert panel was training (academic and in a professional context). The importance of individual commitment to this process was also highlighted, as was the assessment of competencies before nurses take on management roles. We believe this research contributes to the literature on competency-based management, to the literature on defining competencies for nurses with management roles, to the nursing profession (namely, nurses' management roles), and to the SNS (the Portuguese National Health Service) itself, since it makes some proposals and suggestions for the evolution of people management practices.
Abstract:
Estuaries are perhaps the most threatened environments in the coastal fringe; the coincidence of high natural value and attractiveness for human use has led to conflicts between conservation and development. These conflicts occur in the Sado Estuary because it lies near the industrialised zone of the Peninsula of Setúbal while, at the same time, a large part of the estuary is classified as a Natural Reserve owing to its high biodiversity. These facts created the need for a model of environmental management and quality assessment, based on methodologies that enable the assessment of the Sado Estuary's quality and the evaluation of human pressures on the estuary. These methodologies rely on the indicators that best depict the state of the environment, not necessarily on everything that could be measured or analysed. Sediments have long been considered an important temporary source of some compounds, a sink for other types of material, and an interface where a great diversity of biogeochemical transformations occur. For these reasons they are of great importance in the formulation of a coastal management system. Many authors have used sediments to monitor aquatic contamination, with great advantages over traditional water-column sampling. The main objective of this thesis was to develop an estuary environmental management framework applied to the Sado Estuary using the DPSIR Model (EMMSado), including data collection, data processing and data analysis. The support infrastructure of EMMSado was a set of spatially contiguous and homogeneous regions of sediment structure (management units). The environmental quality of the estuary was assessed through sediment quality assessment and integrated, at a preliminary stage, with the human pressure for development. 
Besides the advantages explained earlier, studying the quality of the estuary mainly through indicators and indices of the sediment compartment also makes the methodology easier, faster, and less demanding of human and financial resources. These are essential factors for efficient environmental management of coastal areas. Data management, visualization, processing and analysis were carried out through the combined use of indicators and indices, sampling optimization techniques, Geographical Information Systems, remote sensing, statistics for spatial data, Global Positioning Systems and best expert judgment. As a global conclusion, of the nineteen management units delineated and analyzed, three showed no ecological risk (18.5 % of the study area). The areas of greatest concern (5.6 % of the study area) are located in the North Channel and are under strong human pressure, mainly due to industrial activities. These areas also have low hydrodynamics and are thus associated with high levels of deposition. In particular, the areas near the Lisnave and Eurominas industries can also accumulate contamination coming from the Águas de Moura Channel, since particles from that channel can settle there due to residual flow. In these areas the contaminants of concern, among those analyzed, are the heavy metals and metalloids (Cd, Cu, Zn and As exceeded the PEL guidelines) and the pesticides BHC isomers, heptachlor, isodrin, DDT and its metabolites, endosulfan and endrin. In the remaining management units (76 % of the study area) there is a moderate potential for adverse ecological effects, and in some of these areas no stress agents could be identified. This emphasizes the need for further research, since unmeasured chemicals may be causing or contributing to these adverse effects. Special attention must be paid to the units with a moderate potential for adverse ecological effects that are located inside the natural reserve. 
Non-point source pollution from agriculture and aquaculture activities also seems to contribute an important pollution load to the estuary, entering from the Águas de Moura Channel. This pressure is expressed in a moderate potential for ecological risk in the areas near the entrance of this channel. Pressures may also come from the Alcácer Channel, although they were not quantified in this study. The management framework presented here, including all the methodological tools, may be applied and tested in other estuarine ecosystems, which will also allow comparison with estuarine ecosystems in other parts of the globe.
Abstract:
Eight depositional sequences (DS) delimited by regional disconformities have been recognized in the Miocene of the Lisbon and Setúbal Peninsula areas. On the western coast of the Setúbal Peninsula, outcrops of Lower Burdigalian to Lower Tortonian sediments were studied. The stratigraphic zonography and the environmental considerations are mainly supported by data on foraminifera, ostracoda, vertebrates and palynomorphs. The first mineralogical and geochemical data determined for the Foz da Fonte, Penedo Sul and Penedo Norte sedimentary sequences are presented. These analytical data mainly correspond to the sediments' fine fractions. Mineralogical data are based on X-ray diffraction (XRD), carried out on both the less than 38 nm and 2 nm fractions. Qualitative and semi-quantitative determinations of clay and non-clay minerals were obtained for both fractions. The clay mineral assemblages complement the lithostratigraphic and paleoenvironmental data obtained by stratigraphic and palaeontological studies. Some palaeomagnetic and isotopic data are discussed and correlated with the mineralogical data. Multivariate data analysis (Principal Components Analysis) of the mineralogical data was carried out using both R-mode and Q-mode factor analysis.
Abstract:
Master's degree in Informatics Engineering - Specialization Area in Knowledge and Decision Technologies
Abstract:
This paper presents the Realistic Scenarios Generator (RealScen), a tool that processes data from real electricity markets to generate realistic scenarios that enable the modeling of electricity market players’ characteristics and strategic behavior. The proposed tool provides significant advantages to the decision making process in an electricity market environment, especially when coupled with a multi-agent electricity markets simulator. The generation of realistic scenarios is performed using mechanisms for intelligent data analysis, which are based on artificial intelligence and data mining algorithms. These techniques allow the study of realistic scenarios, adapted to the existing markets, and improve the representation of market entities as software agents, enabling a detailed modeling of their profiles and strategies. This work contributes significantly to the understanding of the interactions between the entities acting in electricity markets by increasing the capability and realism of market simulations.
Abstract:
A thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems.
Abstract:
Harnessing the idle CPU cycles, storage space and other resources of networked computers for collaborative work is the main focus of all major grid computing research projects. Most university computer labs are now equipped with powerful desktop PCs, yet most of the time these machines lie idle and their computing power goes unused. At the same time, complex problems and the analysis of very large amounts of data require large computational resources. For such problems, one may run the analysis algorithms on very powerful and expensive computers, which reduces the number of users who can afford such data analysis tasks. Instead of relying on single expensive machines, distributed computing systems offer the possibility of using a set of much less expensive machines to do the same task. The BOINC and Condor projects have been used successfully to solve real scientific research problems around the world at low cost. The main goal of this work is to explore both distributed computing systems, Condor and BOINC, and to use them to harness idle PC resources for academic researchers to use in their research work. In this thesis, data mining tasks were performed by implementing several machine learning algorithms in the distributed computing environment.