993 resultados para Probabilistic Algorithms
Resumo:
Computational Biology is the research are that contributes to the analysis of biological data through the development of algorithms which will address significant research problems.The data from molecular biology includes DNA,RNA ,Protein and Gene expression data.Gene Expression Data provides the expression level of genes under different conditions.Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences which in turn are later translated into proteins.The number of copies of mRNA produced is called the expression level of a gene.Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions.Experimental conditions can be different tissue types or time points.Entries in the gene expression matrix are real values.Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes such as similarity of their behavior,nature of their interaction,their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process.These patterns have immense relevance and application in bioinformatics and clinical research.Theses patterns are used in the medical domain for aid in more accurate diagnosis,prognosis,treatment planning.drug discovery and protein network analysis.To identify various patterns from gene expression data,data mining techniques are essential.Clustering is an important data mining technique for the analysis of gene expression data.To overcome the problems associated with clustering,biclustering is introduced.Biclustering refers to simultaneous clustering of both rows and columns of a data matrix. Clustering is a global whereas biclustering is a local model.Discovering local expression patterns is essential for identfying many genetic pathways that are not apparent otherwise.It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data.A biclusters is a submatrix of the gene expression data matrix.The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix.Biclusters are not disjoint.Computation of biclusters is costly because one will have to consider all the combinations of columans and rows in order to find out all the biclusters.The search space for the biclustering problem is 2 m+n where m and n are the number of genes and conditions respectively.Usually m+n is more than 3000.The biclustering problem is NP-hard.Biclustering is a powerful analytical tool for the biologist.The research reported in this thesis addresses the problem of biclustering.Ten algorithms are developed for the identification of coherent biclusters from gene expression data.All these algorithms are making use of a measure called mean squared residue to search for biclusters.The objective here is to identify the biclusters of maximum size with the mean squared residue lower than a given threshold. All these algorithms begin the search from tightly coregulated submatrices called the seeds.These seeds are generated by K-Means clustering algorithm.The algorithms developed can be classified as constraint based,greedy and metaheuristic.Constarint based algorithms uses one or more of the various constaints namely the MSR threshold and the MSR difference threshold.The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum.In metaheuristic approaches particle Swarm Optimization(PSO) and variants of Greedy Randomized Adaptive Search Procedure(GRASP) are used for the identification of biclusters.These algorithms are implemented on the Yeast and Lymphoma datasets.Biologically relevant and statistically significant biclusters are identified by all these algorithms which are validated by Gene Ontology database.All these algorithms are compared with some other biclustering algorithms.Algorithms developed in this work overcome some of the problems associated with the already existing algorithms.With the help of some of the algorithms which are developed in this work biclusters with very high row variance,which is higher than the row variance of any other algorithm using mean squared residue, are identified from both Yeast and Lymphoma data sets.Such biclusters which make significant change in the expression level are highly relevant biologically.
Resumo:
In this thesis we attempt to make a probabilistic analysis of some physically realizable, though complex, storage and queueing models. It is essentially a mathematical study of the stochastic processes underlying these models. Our aim is to have an improved understanding of the behaviour of such models, that may widen their applicability. Different inventory systems with randon1 lead times, vacation to the server, bulk demands, varying ordering levels, etc. are considered. Also we study some finite and infinite capacity queueing systems with bulk service and vacation to the server and obtain the transient solution in certain cases. Each chapter in the thesis is provided with self introduction and some important references
Resumo:
Extensive use of the Internet coupled with the marvelous growth in e-commerce and m-commerce has created a huge demand for information security. The Secure Socket Layer (SSL) protocol is the most widely used security protocol in the Internet which meets this demand. It provides protection against eaves droppings, tampering and forgery. The cryptographic algorithms RC4 and HMAC have been in use for achieving security services like confidentiality and authentication in the SSL. But recent attacks against RC4 and HMAC have raised questions in the confidence on these algorithms. Hence two novel cryptographic algorithms MAJE4 and MACJER-320 have been proposed as substitutes for them. The focus of this work is to demonstrate the performance of these new algorithms and suggest them as dependable alternatives to satisfy the need of security services in SSL. The performance evaluation has been done by using practical implementation method.
Resumo:
An Overview of known spatial clustering algorithms The space of interest can be the two-dimensional abstraction of the surface of the earth or a man-made space like the layout of a VLSI design, a volume containing a model of the human brain, or another 3d-space representing the arrangement of chains of protein molecules. The data consists of geometric information and can be either discrete or continuous. The explicit location and extension of spatial objects define implicit relations of spatial neighborhood (such as topological, distance and direction relations) which are used by spatial data mining algorithms. Therefore, spatial data mining algorithms are required for spatial characterization and spatial trend analysis. Spatial data mining or knowledge discovery in spatial databases differs from regular data mining in analogous with the differences between non-spatial data and spatial data. The attributes of a spatial object stored in a database may be affected by the attributes of the spatial neighbors of that object. In addition, spatial location, and implicit information about the location of an object, may be exactly the information that can be extracted through spatial data mining
Resumo:
In this paper we address the problem of face detection and recognition of grey scale frontal view images. We propose a face recognition system based on probabilistic neural networks (PNN) architecture. The system is implemented using voronoi/ delaunay tessellations and template matching. Images are segmented successfully into homogeneous regions by virtue of voronoi diagram properties. Face verification is achieved using matching scores computed by correlating edge gradients of reference images. The advantage of classification using PNN models is its short training time. The correlation based template matching guarantees good classification results
Resumo:
n this paper we address the problem of face detection and recognition of grey scale frontal view images. We propose a face recognition system based on probabilistic neural networks (PNN) architecture. The system is implemented using voronoi/ delaunay tessellations and template matching. Images are segmented successfully into homogeneous regions by virtue of voronoi diagram properties. Face verification is achieved using matching scores computed by correlating edge gradients of reference images. The advantage of classification using PNN models is its short training time. The correlation based template matching guarantees good classification results.
Resumo:
We develop several algorithms for computations in Galois extensions of p-adic fields. Our algorithms are based on existing algorithms for number fields and are exact in the sense that we do not need to consider approximations to p-adic numbers. As an application we describe an algorithmic approach to prove or disprove various conjectures for local and global epsilon constants.
Resumo:
Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.
Resumo:
In this report, we discuss the application of global optimization and Evolutionary Computation to distributed systems. We therefore selected and classified many publications, giving an insight into the wide variety of optimization problems which arise in distributed systems. Some interesting approaches from different areas will be discussed in greater detail with the use of illustrative examples.
Resumo:
Distributed systems are one of the most vital components of the economy. The most prominent example is probably the internet, a constituent element of our knowledge society. During the recent years, the number of novel network types has steadily increased. Amongst others, sensor networks, distributed systems composed of tiny computational devices with scarce resources, have emerged. The further development and heterogeneous connection of such systems imposes new requirements on the software development process. Mobile and wireless networks, for instance, have to organize themselves autonomously and must be able to react to changes in the environment and to failing nodes alike. Researching new approaches for the design of distributed algorithms may lead to methods with which these requirements can be met efficiently. In this thesis, one such method is developed, tested, and discussed in respect of its practical utility. Our new design approach for distributed algorithms is based on Genetic Programming, a member of the family of evolutionary algorithms. Evolutionary algorithms are metaheuristic optimization methods which copy principles from natural evolution. They use a population of solution candidates which they try to refine step by step in order to attain optimal values for predefined objective functions. The synthesis of an algorithm with our approach starts with an analysis step in which the wanted global behavior of the distributed system is specified. From this specification, objective functions are derived which steer a Genetic Programming process where the solution candidates are distributed programs. The objective functions rate how close these programs approximate the goal behavior in multiple randomized network simulations. The evolutionary process step by step selects the most promising solution candidates and modifies and combines them with mutation and crossover operators. This way, a description of the global behavior of a distributed system is translated automatically to programs which, if executed locally on the nodes of the system, exhibit this behavior. In our work, we test six different ways for representing distributed programs, comprising adaptations and extensions of well-known Genetic Programming methods (SGP, eSGP, and LGP), one bio-inspired approach (Fraglets), and two new program representations called Rule-based Genetic Programming (RBGP, eRBGP) designed by us. We breed programs in these representations for three well-known example problems in distributed systems: election algorithms, the distributed mutual exclusion at a critical section, and the distributed computation of the greatest common divisor of a set of numbers. Synthesizing distributed programs the evolutionary way does not necessarily lead to the envisaged results. In a detailed analysis, we discuss the problematic features which make this form of Genetic Programming particularly hard. The two Rule-based Genetic Programming approaches have been developed especially in order to mitigate these difficulties. In our experiments, at least one of them (eRBGP) turned out to be a very efficient approach and in most cases, was superior to the other representations.
Resumo:
In dieser Arbeit werden Algorithmen zur Untersuchung der äquivarianten Tamagawazahlvermutung von Burns und Flach entwickelt. Zunächst werden Algorithmen angegeben mit denen die lokale Fundamentalklasse, die globale Fundamentalklasse und Tates kanonische Klasse berechnet werden können. Dies ermöglicht unter anderem Berechnungen in Brauergruppen von Zahlkörpererweiterungen. Anschließend werden diese Algorithmen auf die Tamagawazahlvermutung angewendet. Die Epsilonkonstantenvermutung kann dadurch für alle Galoiserweiterungen L|K bewiesen werden, bei denen L in einer Galoiserweiterung E|Q vom Grad kleiner gleich 15 eingebettet werden kann. Für die Tamagawazahlvermutung an der Stelle 1 wird ein Algorithmus angegeben, der die Vermutung für ein gegebenes Fallbeispiel L|Q numerischen verifizieren kann. Im Spezialfall, dass alle Charaktere rational oder abelsch sind, kann dieser Algorithmus die Vermutung für L|Q sogar beweisen.
Resumo:
In dieser Dissertation werden Methoden zur optimalen Aufgabenverteilung in Multirobotersystemen (engl. Multi-Robot Task Allocation – MRTA) zur Inspektion von Industrieanlagen untersucht. MRTA umfasst die Verteilung und Ablaufplanung von Aufgaben für eine Gruppe von Robotern unter Berücksichtigung von operativen Randbedingungen mit dem Ziel, die Gesamteinsatzkosten zu minimieren. Dank zunehmendem technischen Fortschritt und sinkenden Technologiekosten ist das Interesse an mobilen Robotern für den Industrieeinsatz in den letzten Jahren stark gestiegen. Viele Arbeiten konzentrieren sich auf Probleme der Mobilität wie Selbstlokalisierung und Kartierung, aber nur wenige Arbeiten untersuchen die optimale Aufgabenverteilung. Da sich mit einer guten Aufgabenverteilung eine effizientere Planung erreichen lässt (z. B. niedrigere Kosten, kürzere Ausführungszeit), ist das Ziel dieser Arbeit die Entwicklung von Lösungsmethoden für das aus Inspektionsaufgaben mit Einzel- und Zweiroboteraufgaben folgende Such-/Optimierungsproblem. Ein neuartiger hybrider Genetischer Algorithmus wird vorgestellt, der einen teilbevölkerungbasierten Genetischen Algorithmus zur globalen Optimierung mit lokalen Suchheuristiken kombiniert. Zur Beschleunigung dieses Algorithmus werden auf die fittesten Individuen einer Generation lokale Suchoperatoren angewendet. Der vorgestellte Algorithmus verteilt die Aufgaben nicht nur einfach und legt den Ablauf fest, sondern er bildet auch temporäre Roboterverbünde für Zweiroboteraufgaben, wodurch räumliche und zeitliche Randbedingungen entstehen. Vier alternative Kodierungsstrategien werden für den vorgestellten Algorithmus entworfen: Teilaufgabenbasierte Kodierung: Hierdurch werden alle möglichen Lösungen abgedeckt, allerdings ist der Suchraum sehr groß. Aufgabenbasierte Kodierung: Zwei Möglichkeiten zur Zuweisung von Zweiroboteraufgaben wurden implementiert, um die Effizienz des Algorithmus zu steigern. Gruppierungsbasierte Kodierung: Zeitliche Randbedingungen zur Gruppierung von Aufgaben werden vorgestellt, um gute Lösungen innerhalb einer kleinen Anzahl von Generationen zu erhalten. Zwei Umsetzungsvarianten werden vorgestellt. Dekompositionsbasierte Kodierung: Drei geometrische Zerlegungen wurden entworfen, die Informationen über die räumliche Anordnung ausnutzen, um Probleme zu lösen, die Inspektionsgebiete mit rechteckigen Geometrien aufweisen. In Simulationsstudien wird die Leistungsfähigkeit der verschiedenen hybriden Genetischen Algorithmen untersucht. Dazu wurde die Inspektion von Tanklagern einer Erdölraffinerie mit einer Gruppe homogener Inspektionsroboter als Anwendungsfall gewählt. Die Simulationen zeigen, dass Kodierungsstrategien, die auf der geometrischen Zerlegung basieren, bei einer kleinen Anzahl an Generationen eine bessere Lösung finden können als die anderen untersuchten Strategien. Diese Arbeit beschäftigt sich mit Einzel- und Zweiroboteraufgaben, die entweder von einem einzelnen mobilen Roboter erledigt werden können oder die Zusammenarbeit von zwei Robotern erfordern. Eine Erweiterung des entwickelten Algorithmus zur Behandlung von Aufgaben, die mehr als zwei Roboter erfordern, ist möglich, würde aber die Komplexität der Optimierungsaufgabe deutlich vergrößern.
Resumo:
This thesis develops a model for the topological structure of situations. In this model, the topological structure of space is altered by the presence or absence of boundaries, such as those at the edges of objects. This allows the intuitive meaning of topological concepts such as region connectivity, function continuity, and preservation of topological structure to be modeled using the standard mathematical definitions. The thesis shows that these concepts are important in a wide range of artificial intelligence problems, including low-level vision, high-level vision, natural language semantics, and high-level reasoning.
Resumo:
Robots must act purposefully and successfully in an uncertain world. Sensory information is inaccurate or noisy, actions may have a range of effects, and the robot's environment is only partially and imprecisely modeled. This thesis introduces active randomization by a robot, both in selecting actions to execute and in focusing on sensory information to interpret, as a basic tool for overcoming uncertainty. An example of randomization is given by the strategy of shaking a bin containing a part in order to orient the part in a desired stable state with some high probability. Another example consists of first using reliable sensory information to bring two parts close together, then relying on short random motions to actually mate the two parts, once the part motions lie below the available sensing resolution. Further examples include tapping parts that are tightly wedged, twirling gears before trying to mesh them, and vibrating parts to facilitate a mating operation.
Resumo:
The work described in this thesis began as an inquiry into the nature and use of optimization programs based on "genetic algorithms." That inquiry led, eventually, to three powerful heuristics that are broadly applicable in gradient-ascent programs: First, remember the locations of local maxima and restart the optimization program at a place distant from previously located local maxima. Second, adjust the size of probing steps to suit the local nature of the terrain, shrinking when probes do poorly and growing when probes do well. And third, keep track of the directions of recent successes, so as to probe preferentially in the direction of most rapid ascent. These algorithms lie at the core of a novel optimization program that illustrates the power to be had from deploying them together. The efficacy of this program is demonstrated on several test problems selected from a variety of fields, including De Jong's famous test-problem suite, the traveling salesman problem, the problem of coordinate registration for image guided surgery, the energy minimization problem for determining the shape of organic molecules, and the problem of assessing the structure of sedimentary deposits using seismic data.