881 results for mining algorithm


Relevance:

20.00%

Publisher:

Abstract:

PURPOSE: Most RB1 mutations are unique and distributed throughout the RB1 gene. Their detection can be time-consuming and the yield especially low in cases of conservatively treated sporadic unilateral retinoblastoma (Rb) patients. In order to identify patients with true risk of developing Rb, and to reduce the number of unnecessary examinations under anesthesia in all other cases, we developed a universal, sensitive, efficient and cost-effective strategy based on intragenic haplotype analysis. METHODS: This algorithm allows the calculation of the a posteriori risk of developing Rb and takes into account (a) RB1 loss of heterozygosity in tumors, (b) preferential paternal origin of new germline mutations, (c) a priori risk derived from empirical data by Vogel, and (d) disease penetrance of 90% in most cases. We report the occurrence of Rb in first-degree relatives of patients with sporadic Rb who visited the Jules Gonin Eye Hospital, Lausanne, Switzerland, from January 1994 to December 2006, compared to the expected new cases of Rb using our algorithm. RESULTS: A total of 134 families with sporadic Rb were enrolled; testing was performed in 570 individuals and 99 patients younger than 4 years old were identified. We observed one new case of Rb. Using our algorithm, the cumulated total a posteriori risk of recurrence was 1.77. CONCLUSIONS: This is the first time that linkage analysis has been validated to monitor the risk of recurrence in sporadic Rb. This should be a useful tool in genetic counseling, especially when direct RB1 screening for mutations yields a negative result or is unavailable.

Relevance:

20.00%

Publisher:

Abstract:

The care of patients with ulcerative colitis (UC) remains challenging despite the fact that morbidity and mortality rates have been considerably reduced during the last 30 years. The traditional management with intravenous corticosteroids was modified by the introduction of ciclosporin and infliximab. In this review, we focus on the treatment of patients with moderate to severe UC. Four typical clinical scenarios are defined and discussed in detail. The treatment recommendations are based on current literature, published guidelines and reviews, and were discussed at a consensus meeting of Swiss experts in the field. Comprehensive treatment algorithms were developed, intended for daily clinical practice.

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we propose a methodology to determine the most efficient and least costly way of performing crew pairing optimization. The methodology is based on algorithm optimization and is developed in the Java programming language, using the open-source Eclipse IDE, to solve crew scheduling problems.

Relevance:

20.00%

Publisher:

Abstract:

Descriptors based on Molecular Interaction Fields (MIF) are highly suitable for drug discovery, but their size (thousands of variables) often limits their application in practice. Here we describe a simple and fast computational method that extracts from a MIF a handful of highly informative points (hot spots) which summarize the most relevant information. The method was specifically developed for drug discovery, is fast, and does not require human supervision, which makes it suitable for application to very large series of compounds. The quality of the results has been tested by running the method on the ligand structure of a large number of ligand-receptor complexes and then comparing the position of the selected hot spots with actual atoms of the receptor. As an additional test, the hot spots obtained with the novel method were used to obtain GRIND-like molecular descriptors which were compared with the original GRIND. In both cases the results show that the novel method is highly suitable for describing ligand-receptor interactions and compares favorably with other state-of-the-art methods.

Relevance:

20.00%

Publisher:

Abstract:

A systolic array to implement lattice-reduction-aided linear detection is proposed for a MIMO receiver. The lattice reduction algorithm and the ensuing linear detections are operated in the same array, which can be hardware-efficient. The all-swap lattice reduction (ASLR) algorithm is considered for the systolic design. ASLR is a variant of the LLL algorithm that processes all lattice basis vectors within one iteration. Lattice-reduction-aided linear detection achieves very similar bit-error-rate performance with the ASLR and LLL algorithms, while ASLR is more time-efficient in the systolic array, especially for systems with a large number of antennas.

Relevance:

20.00%

Publisher:

Abstract:

The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks of PANACEA. The CAC, which is the first stage in the PANACEA pipeline for building Language Resources, adopts an efficient and distributed methodology to crawl for web documents with rich textual content in specific languages and predefined domains. The CAC includes modules that can acquire parallel data from sites with in-domain content available in more than one language. In order to extrinsically evaluate the CAC methodology, we have conducted several experiments that used crawled parallel corpora for the identification and extraction of parallel sentences using sentence alignment. The corpora were then successfully used for domain adaptation of Machine Translation Systems.

Relevance:

20.00%

Publisher:

Abstract:

Automatic creation of polarity lexicons is a crucial issue to be solved in order to reduce time and effort in the first steps of Sentiment Analysis. In this paper we present a methodology based on linguistic cues that allows us to automatically discover, extract and label subjective adjectives that should be collected in a domain-based polarity lexicon. For this purpose, we designed a bootstrapping algorithm that, from a small set of seed polar adjectives, is capable of iteratively identifying, extracting and annotating positive and negative adjectives. Additionally, the method automatically creates lists of highly subjective elements that change their prior polarity even within the same domain. The proposed algorithm reached a precision of 97.5% for positive adjectives and 71.4% for negative ones in the semantic orientation identification task.
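
The bootstrapping idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which linguistic cues the authors use, so the sketch assumes one common cue that is not taken from the source: words joined by "and" tend to share polarity, and words joined by "but" tend to have opposite polarity. The function name `bootstrap_polarity`, the regular expression, and the absence of part-of-speech filtering are all illustrative assumptions.

```python
import re

def bootstrap_polarity(sentences, seeds, max_rounds=10):
    """Grow a polarity lexicon from seed words.

    seeds maps a word to +1 (positive) or -1 (negative).
    Assumed cue (not from the source): "X and Y" share polarity,
    "X but Y" have opposite polarity. No part-of-speech check is done.
    """
    lexicon = dict(seeds)
    pattern = re.compile(r"\b(\w+) (and|but) (\w+)\b")
    for _ in range(max_rounds):
        added = False
        for sentence in sentences:
            for left, conj, right in pattern.findall(sentence.lower()):
                sign = 1 if conj == "and" else -1
                # Propagate polarity from the known word to the unknown one.
                for known, new in ((left, right), (right, left)):
                    if known in lexicon and new not in lexicon:
                        lexicon[new] = sign * lexicon[known]
                        added = True
        if not added:  # stop once a full pass learns nothing new
            break
    return lexicon
```

Each pass can only label words that co-occur with an already labeled word, so several passes may be needed before the lexicon stops growing; this is the iterative part of the bootstrapping.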

Relevance:

20.00%

Publisher:

Abstract:

From a managerial point of view, the more efficient, simple, and parameter-free (ESP) an algorithm is, the more likely it will be used in practice for solving real-life problems. Following this principle, an ESP algorithm for solving the Permutation Flowshop Sequencing Problem (PFSP) is proposed in this article. Using an Iterated Local Search (ILS) framework, the so-called ILS-ESP algorithm is able to compete in performance with other well-known ILS-based approaches, which are considered among the most efficient algorithms for the PFSP. However, while other similar approaches still employ several parameters that can affect their performance if not properly chosen, our algorithm does not require any particular fine-tuning process since it uses basic "common sense" rules for the local search, perturbation, and acceptance criterion stages of the ILS metaheuristic. Our approach defines a new operator for the ILS perturbation process, a new acceptance criterion based on extremely simple and transparent rules, and a biased randomization process of the initial solution to randomly generate different alternative initial solutions of similar quality, which is attained by applying a biased randomization to a classical PFSP heuristic. This diversification of the initial solution aims at avoiding poorly designed starting points and, thus, allows the methodology to take advantage of current trends in parallel and distributed computing. A set of extensive tests, based on literature benchmarks, has been carried out in order to validate our algorithm and compare it against other approaches. These tests show that our parameter-free algorithm is able to compete with state-of-the-art metaheuristics for the PFSP. Also, the experiments show that, when using parallel computing, it is possible to improve the top ILS-based metaheuristic by just incorporating into it our biased randomization process with a high-quality pseudo-random number generator.
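
To make the ILS structure named in this abstract concrete, here is a minimal generic sketch for the PFSP with the makespan objective. It is not the ILS-ESP algorithm: the abstract does not describe its perturbation operator, acceptance criterion, or biased randomization, so the sketch substitutes a random swap, a "never accept a worse solution" rule, and a deterministic start (jobs sorted by decreasing total processing time). All function names and the `iterations` and `seed` parameters are illustrative assumptions.

```python
import random

def makespan(perm, p):
    """Completion time of the last job on the last machine.
    p[j][m] is the processing time of job j on machine m."""
    finish = [0] * len(p[0])
    for j in perm:
        finish[0] += p[j][0]
        for m in range(1, len(finish)):
            # A job starts on machine m once the machine is free and
            # the job has finished on machine m - 1.
            finish[m] = max(finish[m], finish[m - 1]) + p[j][m]
    return finish[-1]

def local_search(perm, p):
    """First-improvement insertion moves until no move helps."""
    best = makespan(perm, p)
    improved = True
    while improved:
        improved = False
        for i in range(len(perm)):
            rest = perm[:i] + perm[i + 1:]
            for k in range(len(perm)):
                cand = rest[:k] + [perm[i]] + rest[k:]
                cost = makespan(cand, p)
                if cost < best:
                    perm, best, improved = cand, cost, True
                    break
            if improved:
                break
    return perm, best

def iterated_local_search(p, iterations=200, seed=0):
    rng = random.Random(seed)
    # Stand-in initial solution (assumption, not the paper's heuristic).
    start = sorted(range(len(p)), key=lambda j: -sum(p[j]))
    current, cur_cost = local_search(start, p)
    best, best_cost = current, cur_cost
    for _ in range(iterations):
        cand = current[:]
        i, k = rng.sample(range(len(cand)), 2)
        cand[i], cand[k] = cand[k], cand[i]  # perturbation: one random swap
        cand, cost = local_search(cand, p)
        if cost <= cur_cost:  # acceptance: never move to a worse solution
            current, cur_cost = cand, cost
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost
```

For example, `iterated_local_search([[3, 2], [1, 4], [2, 1]])` searches over the orderings of three jobs on two machines and returns the best permutation found together with its makespan.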

Relevance:

20.00%

Publisher:

Abstract:

This work, titled "Data and Text Mining Techniques for the Annotation of a Digital Archive", aims to test the feasibility of using automatic text processing techniques to annotate the sessions of the parliamentary debates of the Assembleia da República of Portugal. The work covers concepts such as knowledge discovery technologies (KDD), the process of knowledge discovery in text, the characterization of the various stages of text processing, and the description of some open-source tools for text mining. The methodology was based on experimenting with several text processing techniques using the open-source R/tm package. The results presented are the influence of preprocessing, document size, and corpus size on the outcome of processing with the knnflex algorithm.

Relevance:

20.00%

Publisher:

Abstract:

Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in different fields, e.g., remote sensing, biometry, and speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty related to the use of such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was taken, using the presence and the absence of products. Methods from graph theory were used to extract patterns in data constituted by links between products and the place and date of seizure. A data mining process completed using graphing techniques is called "graph mining". Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevancy. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used a priori to compare structures from preliminary and post-detection patterns. This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.
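
A minimal sketch of the presence/absence "graph mining" idea follows. The abstract does not specify the graph methods used, so this assumes a simple approach: link two products whenever they appear in the same seizure, weight each link by the number of shared seizures, and report connected groups of products as candidate patterns. The function names, the `min_weight` threshold, and any product names passed in are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(seizures):
    """seizures maps a seizure id to the set of products detected in it.
    Returns {(product_a, product_b): number of shared seizures}."""
    weights = defaultdict(int)
    for products in seizures.values():
        for a, b in combinations(sorted(products), 2):
            weights[(a, b)] += 1
    return dict(weights)

def components(weights, min_weight=1):
    """Connected groups of products, keeping only links seen at
    least min_weight times."""
    adj = defaultdict(set)
    for (a, b), w in weights.items():
        if w >= min_weight:
            adj[a].add(b)
            adj[b].add(a)
    seen, groups = set(), []
    for node in sorted(adj):
        if node in seen:
            continue
        stack, group = [node], set()
        while stack:  # depth-first traversal of one component
            n = stack.pop()
            if n in group:
                continue
            group.add(n)
            stack.extend(adj[n] - group)
        seen |= group
        groups.append(group)
    return groups
```

Raising `min_weight` filters out links seen only once, which is one simple way to separate repeated structures from coincidental co-occurrences before interpretation.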

Relevance:

20.00%

Publisher:

Abstract:

The standard one-machine scheduling problem consists in scheduling a set of jobs on one machine, which can handle only one job at a time, so as to minimize the maximum lateness. Each job is available for processing at its release date, requires a known processing time and, after processing is finished, is delivered after a certain time. There can also exist precedence constraints between pairs of jobs, requiring that the first job be completed before the second job can start. An extension of this problem consists in assigning a time interval between the processing of the jobs associated with the precedence constraints, known as finish-start time-lags. In the presence of these constraints, the problem is NP-hard even if preemption is allowed. In this work, we consider a special case of the one-machine preemptive scheduling problem with time-lags, where the time-lags have a chain form, and propose a polynomial algorithm to solve it. The algorithm consists of a polynomial number of calls to the preemptive version of the Longest Tail Heuristic. One application of the method is to obtain lower bounds for NP-hard one-machine and job-shop scheduling problems. We present some computational results of this application, followed by some conclusions.
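
The preemptive Longest Tail Heuristic mentioned in this abstract can be sketched as follows for the basic problem only, that is, with release dates, processing times and delivery times (tails) but without the precedence constraints and chain time-lags treated in the paper. At every moment the machine works on the released, unfinished job with the longest tail, and a running job is interrupted when a newly released job has a longer tail. The function name and the tuple layout `(release, processing, tail)` are assumptions made for illustration.

```python
import heapq

def preemptive_longest_tail(jobs):
    """jobs is a list of (release, processing, tail) tuples.
    Returns the largest completion time plus tail over all jobs under
    the preemptive longest-tail rule. Precedence constraints and
    time-lags are NOT handled in this sketch."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    remaining = [p for _, p, _ in jobs]
    ready = []  # max-heap on tail, stored as (-tail, job index)
    t, nxt, best = 0, 0, 0
    while nxt < len(order) or ready:
        if not ready:
            # Machine idle: jump to the next release date.
            t = max(t, jobs[order[nxt]][0])
        while nxt < len(order) and jobs[order[nxt]][0] <= t:
            j = order[nxt]
            heapq.heappush(ready, (-jobs[j][2], j))
            nxt += 1
        _, j = ready[0]
        next_release = jobs[order[nxt]][0] if nxt < len(order) else float("inf")
        # Run job j until it finishes or another job is released.
        run = min(remaining[j], next_release - t)
        t += run
        remaining[j] -= run
        if remaining[j] == 0:
            heapq.heappop(ready)
            best = max(best, t + jobs[j][2])
    return best
```

For the problem without precedence constraints, this value is a lower bound on the non-preemptive optimum, which matches the use of the method for lower bounds described in the abstract.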

Relevance:

20.00%

Publisher:

Abstract:

In this paper we propose a Pyramidal Classification Algorithm, which together with an appropriate aggregation index produces an indexed pseudo-hierarchy (in the strict sense) without inversions or crossings. The computer implementation of the algorithm makes it possible to carry out some simulation tests by Monte Carlo methods in order to study the efficiency and sensitivity of the pyramidal methods of the Maximum, Minimum and UPGMA. The results shown in this paper may help to choose between the three classification methods proposed, in order to obtain the classification that best fits the original structure of the population, provided we have a priori information concerning this structure.

Relevance:

20.00%

Publisher:

Abstract:

We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor.

Relevance:

20.00%

Publisher:

Abstract:

This work, carried out under the regulations of the undergraduate programs of the Universidade Jean Piaget de Cabo Verde, seeks to highlight the importance of collecting data on the Web today. It also presents a CMS (Content Management System) used in website development, showing that it is possible to obtain potentially useful data about the access to and use of websites by equipping them with components developed for these systems.