84 results for Markov Clustering, GPI Computing, PPI Networks, CUDA, ELLPACK-R Sparse Format, Parallel Computing
Abstract:
The Graphics Processing Unit (GPU) is present in almost every modern personal computer. Despite their special-purpose design, GPUs have been increasingly used for general computations, with very good results. Hence, there is a growing effort from the community to seamlessly integrate this kind of device into everyday computing. However, to fully exploit the potential of a system comprising GPUs and CPUs, these devices should be presented to the programmer as a single platform. The efficient combination of the power of CPU and GPU devices is highly dependent on each device's characteristics, resulting in platform-specific applications that cannot be ported to different systems. Moreover, the most efficient work balance among devices depends heavily on the computations to be performed and the respective data sizes. In this work, we propose a solution for heterogeneous environments based on the abstraction level provided by algorithmic skeletons. Our goal is to take full advantage of all CPU and GPU devices present in a system, without requiring different kernel implementations or explicit work distribution. To that end, we extended Marrow, an algorithmic skeleton framework for multiple GPUs, to support CPU computations and to efficiently balance the workload between devices. Our approach is based on an offline training execution that identifies the ideal work balance and platform configuration for a given application and input data size. The evaluation of this work shows that combining CPU and GPU devices can significantly boost the performance of our benchmarks in the tested environments when compared to GPU-only executions.
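A minimal sketch of the offline-training idea described above: time a few candidate CPU/GPU split ratios for a given input size and persist the best one for later runs. The function names and the JSON configuration format are illustrative assumptions, not Marrow's actual API.

```python
# Illustrative sketch (not Marrow's actual API): choose the CPU/GPU work split
# for a given input size by timing a few candidate ratios offline and
# persisting the best one for subsequent runs.
import json
import time

def profile_splits(run_partition, n_items, ratios=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """run_partition(cpu_share, n_items) executes the kernel with the given
    fraction of work assigned to the CPU and the rest to the GPU(s)."""
    timings = {}
    for cpu_share in ratios:
        start = time.perf_counter()
        run_partition(cpu_share, n_items)
        timings[cpu_share] = time.perf_counter() - start
    best_share = min(timings, key=timings.get)
    return best_share, timings

def save_config(path, input_size, best_share):
    # persist the trained configuration for this application and data size
    with open(path, "w") as fh:
        json.dump({"input_size": input_size, "cpu_share": best_share}, fh)
```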
Abstract:
Human Activity Recognition systems require objective and reliable methods that can be used in the daily routine and must offer consistent results according to the performed activities. These systems are under development and offer objective and personalized support for several applications, such as healthcare. This thesis aims to create a framework for human activity recognition based on accelerometry signals. Some new features and techniques inspired by audio recognition methodology are introduced in this work, namely the Log Scale Power Bandwidth and the application of Markov Models. Forward Feature Selection was adopted as the feature selection algorithm in order to improve the clustering performance and limit the computational demands. This method selects the most suitable set of features for activity recognition in accelerometry from a 423-dimensional feature vector. Several Machine Learning algorithms were applied to the accelerometry databases used — the FCHA and PAMAP databases — and showed promising results in activity recognition. The developed set of algorithms constitutes a strong contribution to the development of reliable methods for evaluating movement disorders for diagnosis and treatment applications.
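The greedy wrapper behind Forward Feature Selection can be sketched as follows; this is an illustrative Python version assuming a feature matrix X (samples x features) and activity labels y, not the thesis' exact implementation.

```python
# Minimal sketch of forward feature selection (greedy wrapper), assuming a
# NumPy feature matrix X (n_samples x n_features) and activity labels y.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def forward_feature_selection(X, y, max_features=20, estimator=None):
    estimator = estimator or KNeighborsClassifier(n_neighbors=5)
    selected, remaining = [], list(range(X.shape[1]))
    best_score = -np.inf
    while remaining and len(selected) < max_features:
        # score every candidate feature added to the current subset
        scores = {
            f: cross_val_score(estimator, X[:, selected + [f]], y, cv=5).mean()
            for f in remaining
        }
        f_best = max(scores, key=scores.get)
        if scores[f_best] <= best_score:
            break  # no candidate improves the cross-validated score
        best_score = scores[f_best]
        selected.append(f_best)
        remaining.remove(f_best)
    return selected, best_score
```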
Abstract:
Complex systems, i.e. systems composed of a large set of elements interacting in a non-linear way, are constantly found all around us. In the last decades, different approaches have been proposed for understanding them, one of the most interesting being the Complex Network perspective. This legacy of the 18th-century mathematical concepts proposed by Leonhard Euler remains current, and increasingly relevant to real-world problems. In recent years, it has been demonstrated that network-based representations can yield relevant knowledge about complex systems. In spite of that, several problems have been detected, mainly related to the degree of subjectivity involved in the creation and evaluation of such network structures. In this Thesis, we propose addressing these problems by means of different data mining techniques, thus obtaining a novel hybrid approach intermingling complex networks and data mining. Results indicate that such techniques can be effectively used to i) enable the creation of novel network representations, ii) reduce the dimensionality of the analyzed systems by pre-selecting the most important elements, iii) describe complex networks, and iv) assist in the analysis of different network topologies. The soundness of this approach is validated through different case studies drawn from actual biomedical problems, e.g. the diagnosis of cancer from tissue analysis, or the study of the dynamics of the brain under different neurological disorders.
Abstract:
Saccharomyces cerevisiae, like other microorganisms, is frequently used in industry to obtain different kinds of products that can be applied in several areas (research, pharmaceutical compounds, etc.). In order to obtain high yields of the desired product, it is necessary to provide adequate medium supplementation during the growth of the microorganism. The highest yields are typically reached using complex media; however, the exact formulation of these media is not known. Moreover, it is difficult to control the exact composition of complex media, leading to batch-to-batch variations. To overcome this problem, some industries choose to use defined media, with a defined and known chemical composition. However, such media often do not reach the high yields obtained with complex media. In order to obtain similar yields with defined media, the addition of many different compounds has to be tested experimentally, so industry relies on a set of empirical methods to formulate defined media that can reach the same high yields as complex media. In this thesis, a defined medium for Saccharomyces cerevisiae was developed using a rational design approach. In this approach, a given metabolic network of Saccharomyces cerevisiae is decomposed into several unique, non-decomposable sub-networks of metabolic reactions that operate coherently in steady state, the so-called elementary flux modes (EFMs). The EFMtool algorithm was used to calculate the EFMs for two Saccharomyces cerevisiae metabolic networks (an amino acid-supplemented metabolic network and a non-supplemented metabolic network). For the supplemented metabolic network, 1352172 EFMs were calculated and then divided into 1306854 EFMs producing biomass and 18582 EFMs exclusively producing CO2 (cellular respiration). For the non-supplemented network, 635 EFMs were calculated and then divided into 215 EFMs producing biomass and 420 EFMs exclusively producing CO2. The EFMs of each group were normalized by their respective glucose consumption value. The EFMs of the supplemented network were then grouped into 30 clusters for the 1306854 EFMs producing biomass and 20 clusters for the 18582 EFMs producing CO2; for the non-supplemented metabolic network, the EFMs of each metabolic function were grouped into 10 clusters. After the clustering step, the concentrations of the other medium compounds were calculated by choosing a reasonable glucose amount and accounting for the proportionality between compound concentrations and the glucose ratios. The approach developed in this thesis may allow a faster and more economical way of developing media.
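A possible sketch of the normalisation and clustering steps, assuming `efms` is an (n_modes x n_reactions) NumPy array of flux values and `glc_idx` indexes the glucose uptake reaction; this is illustrative code, not the pipeline actually used in the thesis.

```python
# Illustrative sketch: normalise elementary flux modes (EFMs) by their glucose
# uptake flux and group them with k-means.
import numpy as np
from sklearn.cluster import KMeans

def cluster_efms(efms, glc_idx, n_clusters=30, random_state=0):
    glucose = np.abs(efms[:, glc_idx])
    usable = glucose > 0                          # keep modes that consume glucose
    normalised = efms[usable] / glucose[usable, None]
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=random_state).fit_predict(normalised)
    return normalised, labels
```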
Abstract:
A botnet is a group of computers infected with a specific sub-set of a malware family and controlled by a single individual, called the botmaster. Such networks are used for, among other things, virtual extortion, spam campaigns and identity theft. They implement different types of evasion techniques that make it harder to group and detect botnet traffic. This thesis introduces a methodology, called CONDENSER, that outputs clusters through a self-organizing map and identifies domain names generated by an unknown pseudo-random seed that is known to the botnet herder(s). Additionally, a DNS Crawler is proposed; this system stores historic DNS data for fast-flux and double fast-flux detection, and is used to identify live C&C IPs used by real botnets. A program, called CHEWER, was developed to automate the calculation of the SVM parameters and features that perform best against the available domain names associated with DGAs. CONDENSER and the DNS Crawler were developed with scalability in mind, so that the detection of fast-flux and double fast-flux networks becomes faster. We used an SVM for the DGA classifier, selecting a total of 11 attributes and achieving a precision of 77.9% and an F-measure of 83.2%. The feature selection method identified the 3 most significant attributes of the full attribute set. For clustering, a Self-Organizing Map was used on a total of 81 attributes. The conclusions of this thesis were accepted at Botconf through a submitted article. Botconf is a well-known conference on botnet research, discovery and mitigation, tailored to industry, where current work and research are presented; it is known for bringing together security and anti-virus companies, law enforcement agencies and researchers.
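A hedged sketch of a DGA classifier in the spirit described above: a few simple lexical features per domain name and an SVM. The features shown here are common illustrative choices and do not reproduce the 11 attributes selected in the thesis.

```python
# Illustrative DGA classifier: lexical features + SVM (not the thesis' feature set).
import math
from collections import Counter
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def entropy(s):
    counts = Counter(s)
    return -sum(c / len(s) * math.log2(c / len(s)) for c in counts.values())

def features(domain):
    name = domain.split(".")[0].lower()
    vowels = sum(ch in "aeiou" for ch in name)
    digits = sum(ch.isdigit() for ch in name)
    return [len(name), entropy(name),
            vowels / max(len(name), 1), digits / max(len(name), 1)]

def train_dga_classifier(domains, labels):
    # labels: 1 for DGA-generated domains, 0 for legitimate ones
    X = [features(d) for d in domains]
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    return clf.fit(X, labels)
```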
Abstract:
Glycosylphosphatidylinositol (GPI) is a complex glycolipid used by dozens of proteins for cell surface anchoring. GPI-anchored proteins have various functions that are essential for cellular maintenance. Defective GPI biosynthesis is the hallmark of inherited GPI deficiency (IGD), a group of rare autosomal diseases caused by mutations in PIGA, PIGL, PIGM, PIGV, PIGN, PIGO and PIGT, all genes indispensable for GPI biosynthesis. A point mutation in the -270GC-rich box in the core promoter of PIGM disrupts binding of the transcription factor (TF) Sp1 to it, imposing nucleosome compaction associated with histone hypoacetylation and thus abrogating transcription of PIGM. As a consequence of PIGM transcriptional repression, addition of the first mannose residue onto the GPI core, and thus GPI production, is impaired, and expression of GPI-anchored proteins on the surface of cells is severely reduced. Patients with PIGM-associated IGD suffer from life-threatening thrombosis and epilepsy but not from intravascular haemolysis and anaemia, two defining features of paroxysmal nocturnal haemoglobinuria (PNH), a rare disease caused by somatic mutations in PIGA. Although the disease-causing mutation in IGD is constitutional and present in all tissues, the degree of GPI deficiency is variable and differs between cells of the same and of different tissues. Accordingly, GPI deficiency is severe in granulocytes and B cells but mild in T cells, fibroblasts, platelets and erythrocytes, hence the lack of intravascular haemolysis. The transcriptional events underlying differential expression of GPI in the haematopoietic cells of PIGM-associated IGD are not known and constitute the general aim of this thesis. Firstly, I found that PIGM mRNA levels are variable amongst normal primary haematopoietic cells. In addition, the nucleosome configuration in the promoter of PIGM is more compacted in B cells than in erythroid cells, and this correlates with the levels of PIGM mRNA expression, i.e., lower in B cells. The presence of several binding sites for GATA-1, a megakaryocyte-erythroid lineage-specific transcription factor, at the PIGM promoter suggested that GATA-1 has a role in PIGM transcription. My results showed that GATA-1 in erythroid cells is most likely a repressor rather than an activator of PIGM expression. Preliminary data suggested that KLF1, an erythroid-specific TF, regulates PIGM transcription independently of the -270GC motif. Secondly, investigation of the role of the Sp TFs showed that Sp1 directly mediates PIGM transcriptional regulation in both B and erythroid cells. However, unlike in B cells, in which active PIGM transcription requires binding of the generic TF Sp1 to the -270GC-rich box, in erythroid cells Sp1 regulates PIGM transcription by binding upstream of, but not to, the -270GC-rich motif. Additionally, I showed that Sp2 is not a direct regulator of PIGM transcription in B or erythroid cells. These findings explain the lack of intravascular haemolysis in PIGM-associated IGD, a defining feature of PNH. Lastly, preliminary work shows that transcriptional repression of PIGM by the pathogenic -270C>G mutation is associated with a reduced frequency of in cis genomic interactions between PIGM and its neighbouring genes, suggesting a shared regulatory link between these genes and PIGM. Altogether, the results presented in this thesis provide novel insights into the tissue-specific transcriptional control of a housekeeping gene by lineage-specific and generic TFs.
Abstract:
In the last few years we have observed an exponential growth of information systems, and parking information is one more example. Reliable and up-to-date information on parking slot availability is very important for the goal of reducing traffic, and parking slot prediction is a new topic that has already started to be applied; San Francisco in the United States and Santander in Spain are examples of projects carried out to obtain this kind of information. The aim of this thesis is the study and evaluation of methodologies for parking slot prediction and their integration in a web application, where all kinds of users will be able to know the current parking status and also future status according to the parking model predictions. The source of the data is ancillary in this work, but it still needs to be understood in order to understand parking behaviour. There are many modelling techniques used for this purpose, such as time series analysis, decision trees, neural networks and clustering. In this work, the author explains the techniques best suited to this task, analyses the results and points out the advantages and disadvantages of each one. The model learns the periodic and seasonal patterns of the parking status behaviour and, with this knowledge, can predict future status values for a given date. The data comes from Smart Park Ontinyent and consists of parking occupancy status together with timestamps, stored in a database. After data acquisition, data analysis and pre-processing were needed for the model implementations. The first test was done with a boosting ensemble classifier, employed over a set of decision trees created with the C5.0 algorithm from a set of training samples, to assign a prediction value to each object. In addition to the predictions, this work reports error measurements that indicate how reliable the predictions are. The second test used the TBATS seasonal exponential smoothing model. Finally, a model combining the previous two was tried, to see the result of this combination. The results were quite good for all of them, with average errors of 6.2, 6.6 and 5.4 vacancies for the three models respectively; for a car park of 47 places this means roughly a 10% average error in parking slot predictions. This result could be even better with a longer data history. In order to make this kind of information visible and reachable by anyone with an internet-connected device, a web application was built. Besides displaying the data, this application also offers different functions to improve the task of searching for parking. The new functions, apart from parking prediction, were: park distances from the user's location, giving the distances from the user's current location to the different car parks in the city; geocoding, the service for matching a literal description or an address to a concrete location; geolocation, the service for positioning the user; and a parking list panel, which is neither a service nor a function, just a better visualization and handling of the information.
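An illustrative sketch of the third experiment, combining a tree-ensemble forecast with a seasonal exponential smoothing forecast by simple averaging. It assumes an hourly pandas Series of occupancy with a DatetimeIndex, and uses gradient-boosted trees as a stand-in for the C5.0 ensemble and statsmodels' exponential smoothing as a stand-in for TBATS.

```python
# Illustrative forecast combination: seasonal smoothing + boosted trees, averaged.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def combined_forecast(occupancy: pd.Series, horizon: int = 24):
    # seasonal exponential smoothing (a daily cycle of 24 hours is assumed)
    smoother = ExponentialSmoothing(occupancy, trend="add",
                                    seasonal="add", seasonal_periods=24).fit()
    smooth_pred = smoother.forecast(horizon).to_numpy()

    # boosted trees on simple calendar features (hour of day, day of week)
    idx = occupancy.index
    X = np.column_stack([idx.hour, idx.dayofweek])
    trees = GradientBoostingRegressor().fit(X, occupancy.to_numpy())
    future = pd.date_range(idx[-1], periods=horizon + 1, freq="h")[1:]
    tree_pred = trees.predict(np.column_stack([future.hour, future.dayofweek]))

    # combined model: simple average of both forecasts
    return (smooth_pred + tree_pred) / 2.0
```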
Abstract:
In the current context of innovation, a large number of studies have analysed the potential of the Open Innovation model. Henry Chesbrough (2003), considered the father of Open Innovation, states that companies are experiencing a "paradigm shift" in the way they develop their innovation processes and commercialise technology and knowledge. The Open Innovation model thus argues that companies can and should use resources available beyond their own boundaries, this combination of internal and external ideas and technologies being crucial to achieving a leading market position. As Chesbrough (2003) stated, innovation is not done in isolation, and the dynamism of the current scenario reinforces this idea. The risks inherent in the innovation process can therefore be mitigated through partnerships between companies and institutions. The adoption of the Open Innovation model is perceived on the basis of the abundance of available knowledge, which may also provide value to the company that created it, as in the case of patent licensing. The present study aimed to identify Open Innovation practices among the partnerships mentioned by Cloud Computing providers. Using Social Network Analysis, matrices were built from the partnerships mentioned by the companies and from information obtained from secondary sources (Sousa, 2012). These relationship matrices (networks) were analysed and represented as diagrams. It was thus possible to draw an overview of the partnerships considered strategic by the interviewed companies and to identify which of them constitute, in fact, Open Innovation practices. Of the 26 strategic partnerships mentioned in the interviews, only 11 were characterised as practices of the open model. The analysis of the practices conducted by the interviewed companies reveals some limitations in how the Open Innovation model is exploited. Finally, some recommendations are made regarding the implementation of this model by small and medium-sized enterprises based on emerging technologies, such as cloud computing.
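A minimal sketch of the Social Network Analysis step: build the relationship matrix and a diagram from a list of (company, partner) pairs. The entity names below are placeholders, not data from the study.

```python
# Illustrative partnership network: adjacency matrix and diagram with networkx.
import networkx as nx
import matplotlib.pyplot as plt

partnerships = [("ProviderA", "UniversityX"), ("ProviderA", "VendorY"),
                ("ProviderB", "VendorY")]          # placeholder entities

G = nx.Graph()
G.add_edges_from(partnerships)

print(nx.to_pandas_adjacency(G))       # relationship matrix (network)
nx.draw_networkx(G, with_labels=True)  # diagram of the partnership network
plt.show()
```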
Abstract:
This work models the competitive behaviour of individuals who maximize their own utility by managing their network of connections with other individuals. Utility is taken as a synonym for reputation in this model. Each agent decides on two variables: the quality of connections and the number of connections. Hence, the reputation of an individual is a function of the number and the quality of connections within the network. On the other hand, individuals incur a cost when they improve their network of contacts. The initial value of the quality and number of connections of each individual is drawn from a given initial distribution. The competition occurs over continuous time and among a continuum of agents. A mean field game approach is adopted to solve the model, leading to an optimal trajectory for the number and quality of connections for each individual.
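In illustrative notation (not the thesis' exact formulation), each agent chooses effort controls on the quality q and number n of links, trading reputation against adjustment costs, given the population distribution m_t of (q, n):

```latex
% Illustrative notation only: reputation U depends on own (q, n) and on the
% population distribution m_t; effort controls a_q, a_n carry a cost c.
\[
  \max_{a_q,\,a_n}\;
  \int_0^{\infty} e^{-\rho t}
  \Bigl[\, U\bigl(q_t, n_t, m_t\bigr) \;-\; c\bigl(a_q(t), a_n(t)\bigr) \Bigr]\, dt,
  \qquad \dot q_t = a_q(t), \quad \dot n_t = a_n(t).
\]
```

In a mean field game of this kind, the equilibrium couples the agents' Hamilton-Jacobi-Bellman equation with the transport equation describing how the optimal controls move the distribution m_t.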
Abstract:
This study discusses some fundamental issues for the development and diffusion of services based on cloud computing to unfold positively across countries. To present the subject, it discusses public initiatives by the countries most advanced in the application of cloud computing and Brazil's position in this context. Based on the evidence presented here, it appears that the elements essential to the development and diffusion of cloud computing in Brazil have taken important steps and show signs of maturity, such as the cybercrime legislation. However, other elements still require analysis and specific adaptations to the cloud computing case, such as Intellectual Property Rights. Although broadband services are still lacking, one cannot disregard the government's effort to facilitate access for all of society. On the other hand, the large volume of the Brazilian IT market is a factor of interest for companies seeking to invest in the country.
Abstract:
This study focuses on the implementation of several pair trading strategies across three emerging markets, with the objective of comparing the results obtained from the different strategies and assessing whether pair trading benefits from a more volatile environment. The results show that there are indeed higher potential profits in emerging markets. However, the higher excess return is partially offset by higher transaction costs, which are a determining factor in the profitability of pair trading strategies. Also, a new clustering approach based on Principal Component Analysis was tested as an alternative to the more standard clustering by Industry Groups. The new clustering approach delivers promising results, consistently reducing volatility to a greater extent than the Industry Group approach, with no significant harm to excess returns.
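A hedged sketch of the PCA-based clustering step, assuming `returns` is a (days x stocks) pandas DataFrame of daily returns: each stock is described by its loadings on the leading principal components, and stocks sharing a cluster label become candidate pairs. This is an illustration, not the study's exact procedure.

```python
# Illustrative PCA-based clustering of stocks as an alternative to Industry Groups.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

def pca_clusters(returns: pd.DataFrame, n_components=5, n_clusters=10):
    standardised = StandardScaler().fit_transform(returns)
    # loadings: one row per stock, one column per principal component
    loadings = PCA(n_components=n_components).fit(standardised).components_.T
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(loadings)
    return pd.Series(labels, index=returns.columns)  # candidate pairs share a label
```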
Abstract:
This paper develops the model of Bicego, Grosso, and Otranto (2008) and applies Hidden Markov Models to predict market direction. The paper draws an analogy between financial markets and speech recognition, seeking inspiration from the latter to solve common issues in quantitative investing. Whereas previous works focus mostly on very complex modifications of the original Hidden Markov Model algorithm, the current paper provides an innovative methodology by drawing inspiration from thoroughly tested, yet simple, speech recognition methodologies. By grouping returns into sequences, Hidden Markov Models can predict market direction in the same way they are used to identify phonemes in speech recognition. The model proves highly successful in identifying market direction but fails to consistently identify whether a trend is in place. All in all, the current paper seeks to bridge the gap between speech recognition and quantitative finance and, even though the model is not fully successful, several refinements are suggested and the room for improvement is significant.
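A minimal sketch of this classification scheme, assuming the hmmlearn package: one HMM is trained per class of return sequence (e.g. "up" versus "down" periods) and a new sequence is assigned to the model that scores it highest, just as phoneme models are used in speech recognition. The class labels and state count are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative per-class HMMs over return sequences; classification by likelihood.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_models(sequences_by_label, n_states=3):
    models = {}
    for label, sequences in sequences_by_label.items():
        X = np.concatenate(sequences)            # each sequence: array of shape (T, 1)
        lengths = [len(s) for s in sequences]
        models[label] = GaussianHMM(n_components=n_states,
                                    n_iter=100).fit(X, lengths)
    return models

def predict_direction(models, sequence):
    # pick the class whose HMM assigns the highest log-likelihood to the sequence
    return max(models, key=lambda lbl: models[lbl].score(sequence))
```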
Abstract:
This research addresses the problem of creating interactive experiences to encourage people to explore spaces. Besides the obvious spaces to visit, such as museums or art galleries, the spaces people visit can also be, for example, a supermarket or a restaurant. As technology evolves, people become more demanding in the way they use it and expect better forms of interaction with the space that surrounds them. Interaction with the space allows information to be transmitted to visitors in a friendly way, leading visitors to explore it and gain knowledge. Systems that provide better experiences while exploring spaces demand hardware and software that is not within the reach of every space owner, either because of the cost or because of the inconvenience of the installation, which can damage artefacts or the space environment. We propose a system adaptable to spaces that uses a video camera network and a wi-fi network present at the space (or that can be installed) to support interactive experiences using the visitor's mobile device. The system is composed of an infrastructure (called vuSpot), a language grammar used to describe interactions at a space (called XploreDescription), a visual tool used to design interactive experiences (called XploreBuilder) and a tool used to create interactive experiences (called urSpace). Using XploreBuilder, a tool built on top of vuSpot, a user with little or no experience in programming can define a space and design interactive experiences. This tool generates a description of the space and of the interactions at that space (complying with the XploreDescription grammar). These descriptions can be given to urSpace, another tool built on top of vuSpot, which creates the interactive experience application. With this system we explore new forms of interaction and use mobile devices and pico projectors to deliver additional information to the users, leading to the creation of interactive experiences. The several components are presented, as well as the results of the respective user tests, which were positive. The design and implementation become cheaper, faster and more flexible and, since they do not depend on knowledge of a programming language, accessible to the general public.
Abstract:
In the following text I will develop three major aspects. The first is to draw attention to what seem to have been the disciplinary fields where, despite everything, the Digital Humanities (in the broad perspective adopted here) have asserted themselves in a more comprehensive manner. I think it is here that I run the greatest risks, not only for the reasons mentioned above, but also because a significant part of the achievements and of the researchers may have escaped the look I sought to cast upon the past few decades, always influenced by my own experience and by the work carried out in the field of History. But this can be considered a work in progress, open to criticism and suggestions. A second point to note is that emphasis will be given to the main lines of development in the relationship between historical research and digital methodologies, resources and tools. Finally, I will try to make a brief analysis of how the Digital Humanities discourse has been appropriated in recent years, with admittedly debatable data and methods, because studies are still scarce and little systematic information is available that would allow us to go beyond an introductory reflection.