960 resultados para Automatic Gridding of microarray images


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes the followed methodology to automatically generate titles for a corpus of questions that belong to sociological opinion polls. Titles for questions have a twofold function: (1) they are the input of user searches and (2) they inform about the whole contents of the question and possible answer options. Thus, generation of titles can be considered as a case of automatic summarization. However, the fact that summarization had to be performed over very short texts together with the aforementioned quality conditions imposed on new generated titles led the authors to follow knowledge-rich and domain-dependent strategies for summarization, disregarding the more frequent extractive techniques for summarization.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose an unsupervised methodology to automatically discover pairs of semantically related words by highlighting their local environment and evaluating their semantic similarity in local and global semantic spaces. This proposal di®ers from previous research as it tries to take the best of two different methodologies i.e. semantic space models and information extraction models. It can be applied to extract close semantic relations, it limits the search space and it is unsupervised.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. ^ The first part of dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmarking cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently god performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. ^ The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method. Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Traditional Optics has provided ways to compensate some common visual limitations (up to second order visual impairments) through spectacles or contact lenses. Recent developments in wavefront science make it possible to obtain an accurate model of the Point Spread Function (PSF) of the human eye. Through what is known as the "Wavefront Aberration Function" of the human eye, exact knowledge of the optical aberration of the human eye is possible, allowing a mathematical model of the PSF to be obtained. This model could be used to pre-compensate (inverse-filter) the images displayed on computer screens in order to counter the distortion in the user's eye. This project takes advantage of the fact that the wavefront aberration function, commonly expressed as a Zernike polynomial, can be generated from the ophthalmic prescription used to fit spectacles to a person. This allows the pre-compensation, or onscreen deblurring, to be done for various visual impairments, up to second order (commonly known as myopia, hyperopia, or astigmatism). The technique proposed towards that goal and results obtained using a lens, for which the PSF is known, that is introduced into the visual path of subjects without visual impairment will be presented. In addition to substituting the effect of spectacles or contact lenses in correcting the loworder visual limitations of the viewer, the significance of this approach is that it has the potential to address higher-order abnormalities in the eye, currently not correctable by simple means.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Category hierarchy is an abstraction mechanism for efficiently managing large-scale resources. In an open environment, a category hierarchy will inevitably become inappropriate for managing resources that constantly change with unpredictable pattern. An inappropriate category hierarchy will mislead the management of resources. The increasing dynamicity and scale of online resources increase the requirement of automatically maintaining category hierarchy. Previous studies about category hierarchy mainly focus on either the generation of category hierarchy or the classification of resources under a pre-defined category hierarchy. The automatic maintenance of category hierarchy has been neglected. Making abstraction among categories and measuring the similarity between categories are two basic behaviours to generate a category hierarchy. Humans are good at making abstraction but limited in ability to calculate the similarities between large-scale resources. Computing models are good at calculating the similarities between large-scale resources but limited in ability to make abstraction. To take both advantages of human view and computing ability, this paper proposes a two-phase approach to automatically maintaining category hierarchy within two scales by detecting the internal pattern change of categories. The global phase clusters resources to generate a reference category hierarchy and gets similarity between categories to detect inappropriate categories in the initial category hierarchy. The accuracy of the clustering approaches in generating category hierarchy determines the rationality of the global maintenance. The local phase detects topical changes and then adjusts inappropriate categories with three local operations. The global phase can quickly target inappropriate categories top-down and carry out cross-branch adjustment, which can also accelerate the local-phase adjustments. The local phase detects and adjusts the local-range inappropriate categories that are not adjusted in the global phase. By incorporating the two complementary phase adjustments, the approach can significantly improve the topical cohesion and accuracy of category hierarchy. A new measure is proposed for evaluating category hierarchy considering not only the balance of the hierarchical structure but also the accuracy of classification. Experiments show that the proposed approach is feasible and effective to adjust inappropriate category hierarchy. The proposed approach can be used to maintain the category hierarchy for managing various resources in dynamic application environment. It also provides an approach to specialize the current online category hierarchy to organize resources with more specific categories.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The overall aim of our research is to develop a clinical information retrieval system that retrieves systematic reviews and underlying clinical studies from the Cochrane Library to support physician decision making. We believe that in order to accomplish this goal we need to develop a mechanism for effectively representing documents that will be retrieved by the application. Therefore, as a first step in developing the retrieval application we have developed a methodology that semi-automatically generates high quality indices and applies them as descriptors to documents from The Cochrane Library. In this paper we present a description and implementation of the automatic indexing methodology and an evaluation that demonstrates that enhanced document representation results in the retrieval of relevant documents for clinical queries. We argue that the evaluation of information retrieval applications should also include an evaluation of the quality of the representation of documents that may be retrieved. ©2010 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a solution to part of the problem of making robotic or semi-robotic digging equipment less dependant on human supervision. A method is described for identifying rocks of a certain size that may affect digging efficiency or require special handling. The process involves three main steps. First, by using range and intensity data from a time-of-flight (TOF) camera, a feature descriptor is used to rank points and separate regions surrounding high scoring points. This allows a wide range of rocks to be recognized because features can represent a whole or just part of a rock. Second, these points are filtered to extract only points thought to belong to the large object. Finally, a check is carried out to verify that the resultant point cloud actually represents a rock. Results are presented from field testing on piles of fragmented rock. Note to Practitioners—This paper presents an algorithm to identify large boulders in a pile of broken rock as a step towards an autonomous mining dig planner. In mining, piles of broken rock can contain large fragments that may need to be specially handled. To assess rock piles for excavation, we make use of a TOF camera that does not rely on external lighting to generate a point cloud of the rock pile. We then segment large boulders from its surface by using a novel feature descriptor and distinguish between real and false boulder candidates. Preliminary field experiments show promising results with the algorithm performing nearly as well as human test subjects.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

By examining the pictorial content of the veintena section of the Primeros Memoriales, the manusctipt compiled by fray Bernardino de Sahagún, I identify new pieces of evidence on the origin of these illustrations and their authors. A carefull analisis of this material suggests that it is strongly embedded in the pre-Hispanic tradition and that it is doubtful that their iconographic sources originated in Tepeopolco, as it is widely believed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The availability of CFD software that can easily be used and produce high efficiency on a wide range of parallel computers is extremely limited. The investment and expertise required to parallelise a code can be enormous. In addition, the cost of supercomputers forces high utilisation to justify their purchase, requiring a wide range of software. To break this impasse, tools are urgently required to assist in the parallelisation process that dramatically reduce the parallelisation time but do not degrade the performance of the resulting parallel software. In this paper we discuss enhancements to the Computer Aided Parallelisation Tools (CAPTools) to assist in the parallelisation of complex unstructured mesh-based computational mechanics codes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a new dynamic load balancing technique for structured mesh computational mechanics codes in which the processor partition range limits of just one of the partitioned dimensions uses non-coincidental limits, as opposed to using coincidental limits in all of the partitioned dimensions. The partition range limits are 'staggered', allowing greater flexibility in obtaining a balanced load distribution in comparison to when the limits are changed 'globally'. as the load increase/decrease on one processor no longer restricts the load decrease/increase on a neighbouring processor. The automatic implementation of this 'staggered' load balancing strategy within an existing parallel code is presented in this paper, along with some preliminary results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: Understanding transcriptional regulation by genome-wide microarray studies can contribute to unravel complex relationships between genes. Attempts to standardize the annotation of microarray data include the Minimum Information About a Microarray Experiment (MIAME) recommendations, the MAGE-ML format for data interchange, and the use of controlled vocabularies or ontologies. The existing software systems for microarray data analysis implement the mentioned standards only partially and are often hard to use and extend. Integration of genomic annotation data and other sources of external knowledge using open standards is therefore a key requirement for future integrated analysis systems. Results: The EMMA 2 software has been designed to resolve shortcomings with respect to full MAGE-ML and ontology support and makes use of modern data integration techniques. We present a software system that features comprehensive data analysis functions for spotted arrays, and for the most common synthesized oligo arrays such as Agilent, Affymetrix and NimbleGen. The system is based on the full MAGE object model. Analysis functionality is based on R and Bioconductor packages and can make use of a compute cluster for distributed services. Conclusion: Our model-driven approach for automatically implementing a full MAGE object model provides high flexibility and compatibility. Data integration via SOAP-based web-services is advantageous in a distributed client-server environment as the collaborative analysis of microarray data is gaining more and more relevance in international research consortia. The adequacy of the EMMA 2 software design and implementation has been proven by its application in many distributed functional genomics projects. Its scalability makes the current architecture suited for extensions towards future transcriptomics methods based on high-throughput sequencing approaches which have much higher computational requirements than microarrays.