852 results for 080109 Pattern Recognition and Data Mining


Relevance:

100.00% 100.00%

Publisher:

Abstract:

For the last decade, high-resolution (HR)-MS has been associated with qualitative analyses while triple quadrupole MS has been associated with routine quantitative analyses. However, this paradigm is shifting: quantitative and qualitative analyses will increasingly be performed by HR-MS, and it will become the common 'language' for most mass spectrometrists. Most analyses will be performed by full-scan acquisitions recording 'all' ions entering the HR-MS, with subsequent construction of narrow-width extracted-ion chromatograms. Ions will be available for absolute quantification, profiling and data mining. In parallel to quantification, metabotyping will be the next step in clinical LC-MS analyses because it should help in personalized medicine. This article aims to help analytical chemists who perform targeted quantitative acquisitions with triple quadrupole MS make the transition to quantitative and qualitative analyses using HR-MS. Guidelines for the acceptance criteria of mass accuracy and for the determination of mass extraction windows in quantitative analyses are proposed.
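
As a rough illustration of how a narrow-width extracted-ion chromatogram can be built from full-scan data, the sketch below sums, for each scan, the intensity of all peaks whose m/z falls inside a symmetric window expressed in ppm. The function names, the scan data layout and the 5 ppm default are assumptions made for this example; they are not the acceptance criteria or window widths proposed in the article.

```python
def mass_window(target_mz, ppm):
    """Return the (low, high) m/z bounds of a symmetric +/- ppm window."""
    delta = target_mz * ppm / 1e6
    return target_mz - delta, target_mz + delta


def mass_error_ppm(measured_mz, theoretical_mz):
    """Signed mass error of a measured m/z, in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def extracted_ion_chromatogram(scans, target_mz, ppm=5.0):
    """Build an extracted-ion chromatogram from full-scan spectra.

    scans: list of (retention_time, [(mz, intensity), ...]) tuples
           (hypothetical layout chosen for this sketch).
    Returns a list of (retention_time, summed_intensity) pairs.
    """
    low, high = mass_window(target_mz, ppm)
    return [
        (rt, sum(intensity for mz, intensity in peaks if low <= mz <= high))
        for rt, peaks in scans
    ]
```

Narrowing `ppm` rejects more interfering ions but also risks dropping the analyte signal when the measured mass drifts, which is why the window width has to be matched to the mass accuracy actually achieved.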

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It is possible to improve the fringe binarization method of joint transform correlation by choosing a suitable threshold level.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article presents a study that analyzes different aspects of using liquid crystal displays, extracted from a video projector, in correlation-based pattern recognition setups. The operating conditions of the displays and their possible configuration modes are analyzed. Two types of correlation filter are studied, the classical matched filter and the phase-only filter, along with the way to encode them on the displays. Finally, the results of a series of experiments using a VanderLugt correlator and different display configurations are presented. From all of this, the optimal operating conditions of the system are deduced.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Although increasing our knowledge of the properties of networks of cities is essential, these properties can be measured at the city level, and must be assessed by analyzing actor networks. The present volume focuses less on individual characteristics and more on the interactions of actors and institutions that create functional territories in which the structure of existing links constrains emerging links. Rather than basing explanations on external factors, the goal is to determine the extent to which network properties reflect spatial distributions and create local synergies at the meso level that are incorporated into global networks at the macro level where different geographical scales occur. The paper introduces the way to use the graphs structure to identify empirically relevant groups and levels that explain dynamics. It defines what could be called "multi-level", "multi-scale", or "multidimensional" networks in the context of urban geography. It explains how the convergence of the network multi-territoriality paradigm collaboratively formulated, and manipulated by geographers and computer scientists produced the SPANGEO project, which is exposed in this volume.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The theory of small-world networks as initiated by Watts and Strogatz (1998) has drawn new insights in spatial analysis as well as systems theory. The theory's concepts and methods are particularly relevant to geography, where spatial interaction is mainstream and where interactions can be described and studied using large numbers of exchanges or similarity matrices. Networks are organized through direct links or by indirect paths, inducing topological proximities that simultaneously involve spatial, social, cultural or organizational dimensions. Network synergies build over similarities and are fed by complementarities between or inside cities, with the two effects potentially amplifying each other according to the "preferential attachment" hypothesis that has been explored in a number of different scientific fields (Barabási, Albert 1999; Barabási A-L 2002; Newman M, Watts D, Barabási A-L). In fact, according to Barabási and Albert (1999), the high level of hierarchy observed in "scale-free networks" results from "preferential attachment", which characterizes the development of networks: new connections appear preferentially close to nodes that already have the largest number of connections because in this way, the improvement in the network accessibility of the new connection will likely be greater. However, at the same time, network regions gathering dense and numerous weak links (Granovetter, 1985) or network entities acting as bridges between several components (Burt 2005) offer a higher capacity for urban communities to benefit from opportunities and create future synergies. Several methodologies have been suggested to identify such denser and more coherent regions (also called communities or clusters) in terms of links (Watts, Strogatz 1998; Watts 1999; Barabási, Albert 1999; Barabási 2002; Auber 2003; Newman 2006). 
These communities not only possess a high level of dependency among their member entities but also show a low level of "vulnerability", allowing for numerous redundancies (Burt 2000; Burt 2005). The SPANGEO project 2005-2008 (SPAtial Networks in GEOgraphy), gathering a team of geographers and computer scientists, has included empirical studies to survey concepts and measures developed in other related fields, such as physics, sociology and communication science. The relevancy and potential interpretation of weighted or non-weighted measures on edges and nodes were examined and analyzed at different scales (intra-urban, inter-urban or both). New classification and clustering schemes based on the relative local density of subgraphs were developed. The present article describes how these notions and methods contribute on a conceptual level, in terms of measures, delineations, explanatory analyses and visualization of geographical phenomena.
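
The "preferential attachment" mechanism described above can be illustrated with a small growth simulation. The sketch below is a simplified Barabási-Albert style generator written for this listing, not code from the SPANGEO project; the function names, the fully connected starting core and the default of two links per new node are assumptions.

```python
import random


def preferential_attachment_graph(n_nodes, links_per_node=2, seed=0):
    """Grow an undirected graph where new nodes favor high-degree targets."""
    rng = random.Random(seed)
    # Start from a small fully connected core (an assumption of this sketch).
    core = links_per_node + 1
    edges = [(i, j) for i in range(core) for j in range(i + 1, core)]
    # Every node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list selects a node with probability proportional to
    # its current degree.
    endpoints = [node for edge in edges for node in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Return a {node: degree} mapping for an edge list."""
    deg = {}
    for a, b in edges:
        deg[a] = deg.get(a, 0) + 1
        deg[b] = deg.get(b, 0) + 1
    return deg
```

Inspecting `degrees(preferential_attachment_graph(1000))` typically shows a few hubs with many links and a large majority of nodes that keep only their initial links, which is the hierarchy the abstract attributes to scale-free networks.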

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This master's thesis covers the concepts of knowledge discovery, data mining and technology forecasting methods in telecommunications. It covers the various aspects of knowledge discovery in databases and discusses in detail the data mining and technology forecasting methods that are used in telecommunications. The main concern of this thesis is to emphasize the methods that are being used in technology forecasting for telecommunications and in data mining. It tries to answer, to some extent, the question of whether forecasts create a future. It also describes a few difficulties that arise in technology forecasting. This thesis was done as part of my master's studies at Lappeenranta University of Technology.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This thesis deals with distance transforms, which are a fundamental issue in image processing and computer vision. In this thesis, two new distance transforms for gray level images are presented. As a new application for distance transforms, they are applied to gray level image compression. The new distance transforms are both extensions of the well-known distance transform algorithm developed by Rosenfeld, Pfaltz and Lay. With some modification, their algorithm, which calculates a distance transform on binary images with a chosen kernel, has been made to calculate a chessboard-like distance transform with integer numbers (DTOCS) and a real-valued distance transform (EDTOCS) on gray level images. Both distance transforms, the DTOCS and EDTOCS, require only two passes over the gray level image and are extremely simple to implement. Only two image buffers are needed: the original gray level image and the binary image which defines the region(s) of calculation. No other image buffers are needed even if more than one iteration round is performed. For large neighborhoods and complicated images the two-pass distance algorithm has to be applied to the image more than once, typically 3-10 times. Different types of kernels can be adopted. It is important to note that no other existing transform calculates the same kind of distance map as the DTOCS. All the other algorithms (gray-weighted distance functions, GRAYMAT, etc.) find the minimum path joining two points by the smallest sum of gray levels or weight the distance values directly by the gray levels in some manner. The DTOCS does not weight them that way. The DTOCS gives a weighted version of the chessboard distance map. The weights are not constant, but gray value differences of the original image. The difference between the DTOCS map and other distance transforms for gray level images is shown. The difference between the DTOCS and EDTOCS is that the EDTOCS calculates these gray level differences in a different way. 
It propagates local Euclidean distances inside a kernel. Analytical derivations of some results concerning the DTOCS and the EDTOCS are presented. Distance transforms are commonly used for feature extraction in pattern recognition and learning. Their use in image compression is very rare. This thesis introduces a new application area for distance transforms. Three new image compression algorithms based on the DTOCS and one based on the EDTOCS are presented. Control points, i.e. points that are considered fundamental for the reconstruction of the image, are selected from the gray level image using the DTOCS and the EDTOCS. The first group of methods selects the maxima of the distance image as new control points and the second group of methods compares the DTOCS distance to the binary image chessboard distance. The effect of applying threshold masks of different sizes along the threshold boundaries is studied. The time complexity of the compression algorithms is analyzed both analytically and experimentally. It is shown that the time complexity of the algorithms is independent of the number of control points, i.e. the compression ratio. A new morphological image decompression scheme, the 8 kernels' method, is also presented. Several decompressed images are presented. The best results are obtained using the Delaunay triangulation. The obtained image quality equals that of the DCT images with a 4 x 4
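
The two-pass scheme described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes that the local step cost between 8-neighbors is the absolute gray value difference plus one, that pixels outside the calculation region act as zero-distance sources, and it keeps the distances in a separate array rather than reusing the binary image buffer. The function name `dtocs` and its parameters are chosen for this example.

```python
INF = float("inf")


def dtocs(gray, region, max_rounds=10):
    """Two-pass chessboard-like distance transform on a gray level image.

    gray:   2D list of gray values.
    region: 2D list of 0/1 flags; 1 marks pixels where distances are
            calculated, 0 marks source pixels with distance 0.
    """
    rows, cols = len(gray), len(gray[0])
    dist = [[INF if region[r][c] else 0 for c in range(cols)]
            for r in range(rows)]
    forward = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]   # already-visited neighbors
    backward = [(1, 1), (1, 0), (1, -1), (0, 1)]

    def sweep(row_order, col_order, mask):
        changed = False
        for r in row_order:
            for c in col_order:
                if not region[r][c]:
                    continue
                for dr, dc in mask:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        # Assumed local cost: gray value difference plus one.
                        cand = dist[rr][cc] + abs(gray[r][c] - gray[rr][cc]) + 1
                        if cand < dist[r][c]:
                            dist[r][c] = cand
                            changed = True
        return changed

    # Repeat the forward/backward pair until nothing changes.
    for _ in range(max_rounds):
        fwd = sweep(range(rows), range(cols), forward)
        bwd = sweep(range(rows - 1, -1, -1), range(cols - 1, -1, -1), backward)
        if not (fwd or bwd):
            break
    return dist
```

On a flat image the gray differences vanish and the result reduces to the ordinary chessboard distance map, which is a convenient sanity check for any implementation of this kind.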

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A deep understanding of employees requires companies to invest broadly in information management. Combining this with predictive analytics and data mining gives companies a new dimension for developing human resource management functions in the interests of both the employees and the company. The aim of the thesis was to examine the use of data mining in human resource management. The thesis was carried out using the constructive research method. The theoretical framework focused on understanding the concepts of predictive analytics and data mining. The empirical part of the thesis consisted of a qualitative part and a quantitative part. The qualitative part consisted of a preliminary study that addressed the use of predictive analytics and data mining. The quantitative part was built on a data mining project carried out for human resource management that studied employee turnover. The preliminary study found that the challenges in using data mining include data ownership, competence, and understanding of the possibilities. As a result of the data mining project, it can be stated that the correlation analyses and the logistic regression analysis applied in the study revealed statistical dependencies with regard to employees who leave voluntarily.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Processing and refinement of materials. Presentation for Liikearkistopäivät 2015.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

By leveraging cloud services, companies and organizations can significantly improve their efficiency and build novel business opportunities. Cloud computing offers companies various advantages but also carries risks. The advantages offered by service providers mostly concern efficiency and reliability, while the risks mostly concern security. Cloud security still demands significant attention. Security problems in the cloud, like security problems in any area of computing, cannot be fully eliminated. However, service providers can use novel solutions to mitigate the potential threats to a large extent. Viewed from a very high level, the security problem has two focus directions: on one side are the security problems that threaten the service user’s security and privacy; on the other side are those that threaten the service provider’s security and privacy. Both kinds of threats should mostly be detected and mitigated by service providers. Looking a bit closer at the problem, mitigating security problems that target providers can protect both the service provider and the user. However, the research community mostly focuses on solutions that protect cloud users. A significant research effort has been put into protecting cloud tenants against external attacks. However, attacks that originate from elastic, on-demand and legitimate cloud resources should still be considered seriously. The cloud-based botnet, or botcloud, is one of the prevalent cases of cloud resource misuse. Unfortunately, some of the cloud’s essential characteristics enable criminals to form reliable and low-cost botclouds in a short time. In this paper, we present a system that helps to detect distributed infected Virtual Machines (VMs) acting as elements of botclouds. 
Our system groups VMs based on a set of botnet-related system-level symptoms. Grouping VMs helps to separate infected VMs from others and narrows down the target group under inspection. Our system takes advantage of Virtual Machine Introspection (VMI) and data mining techniques.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Feature selection plays an important role in knowledge discovery and data mining. In traditional rough set theory, feature selection using a reduct (the minimal discerning set of attributes) is an important area. Nevertheless, the original definition of a reduct is restrictive, so earlier research proposed taking into account not only the horizontal reduction of information by feature selection, but also a vertical reduction considering suitable subsets of the original set of objects. Following that work, a new approach to generating bireducts using a multi-objective genetic algorithm was proposed. Although genetic algorithms were used to calculate reducts in some previous works, we did not find any work where genetic algorithms were adopted to calculate bireducts. Compared to earlier work in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm system estimated the quality of each bireduct by the values of two objective functions as evolution progressed, so that a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine whether the differences between the results were significant. The experiment showed that the proposed method was able to reduce the number of bireducts necessary to achieve good prediction accuracy. The influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was also analyzed. It was shown that the prediction accuracies of the proposed method are comparable with the best results in the machine learning literature, and some of them outperformed those results.
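
To make the horizontal and vertical reduction concrete, the sketch below checks whether a pair (attribute subset, object subset) is a bireduct. It assumes the usual definition from the rough set literature: the attributes must discern every pair of selected objects that have different decisions, no attribute can be dropped, and no further object can be added. The function names and the table layout are hypothetical, and the genetic search described in the abstract is not reproduced here.

```python
from itertools import combinations


def discerns(table, decisions, attrs, objects):
    """True if every pair of objects with different decisions differs on
    at least one attribute in `attrs`."""
    for i, j in combinations(sorted(objects), 2):
        if decisions[i] != decisions[j] and all(
            table[i][a] == table[j][a] for a in attrs
        ):
            return False
    return True


def is_bireduct(table, decisions, attrs, objects):
    """Check the assumed bireduct conditions for (attrs, objects)."""
    attrs, objects = set(attrs), set(objects)
    if not discerns(table, decisions, attrs, objects):
        return False
    # Horizontal minimality: removing any single attribute must break
    # discernibility (enough, because discernibility is monotone in attrs).
    for a in attrs:
        if discerns(table, decisions, attrs - {a}, objects):
            return False
    # Vertical maximality: no object outside the subset can be added.
    for o in set(range(len(table))) - objects:
        if discerns(table, decisions, attrs, objects | {o}):
            return False
    return True
```

For instance, with `table = [(0, 0), (0, 1), (1, 1), (0, 1)]` and `decisions = ["n", "y", "y", "n"]`, objects 1 and 3 are identical but carry different decisions, so no single reduct covers all four rows, while the pair `({1}, {0, 1, 2})` satisfies the check.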

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Decision trees are very powerful tools for classification in data mining tasks that involve different types of attributes. Numeric data sets are usually first converted to categorical types and then classified using information gain concepts. Information gain is a very popular and useful concept that tells you whether splitting on a given attribute yields any benefit as far as information content is concerned. But this process is computationally intensive for large data sets. Also, popular decision tree algorithms like ID3 cannot handle numeric data sets. This paper proposes statistical variance as an alternative to information gain, together with the statistical mean, to split attributes in completely numerical data sets. The new algorithm has been shown to be competitive with its information gain counterpart C4.5, and with many existing decision tree algorithms, on the standard UCI benchmarking datasets using the ANOVA test. The specific advantages of the proposed algorithm are that it avoids the computational overhead of information gain computation for large data sets with many attributes, and that it avoids converting huge numeric data sets to categorical data, which is also a time-consuming task. In summary, huge numeric datasets can be submitted directly to this algorithm without any attribute mappings or information gain computations. It also blends the two closely related fields of statistics and data mining.
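
The abstract does not spell out the exact splitting criterion, so the following is only one plausible reading, sketched under explicit assumptions: each numeric attribute is split at its mean, and attributes are ranked by how much the split reduces the variance of numerically coded class labels. The function names and the use of label variance as the score are choices made for this example, not the paper's algorithm.

```python
from statistics import mean, pvariance


def mean_split(rows, labels, attr):
    """Split the labels into two groups at the mean of attribute `attr`."""
    threshold = mean(row[attr] for row in rows)
    left = [y for row, y in zip(rows, labels) if row[attr] <= threshold]
    right = [y for row, y in zip(rows, labels) if row[attr] > threshold]
    return threshold, left, right


def variance_reduction(labels, left, right):
    """Parent label variance minus the size-weighted child variances."""
    n = len(labels)
    children = sum(
        len(part) / n * pvariance(part) for part in (left, right) if part
    )
    return pvariance(labels) - children


def best_attribute(rows, labels):
    """Return (attribute index, variance reduction, threshold) of the best
    mean split; no discretization or entropy computation is involved."""
    best = None
    for attr in range(len(rows[0])):
        threshold, left, right = mean_split(rows, labels, attr)
        gain = variance_reduction(labels, left, right)
        if best is None or gain > best[1]:
            best = (attr, gain, threshold)
    return best
```

Because the threshold is a single mean per attribute, each candidate split costs one pass over the data, which is the kind of saving over sorting-based or entropy-based threshold searches that the abstract points to.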

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reports a novel region-based shape descriptor based on orthogonal Legendre moments. The preprocessing steps for invariance improvement of the proposed Improved Legendre Moment Descriptor (ILMD) are discussed. The performance of the ILMD is compared to the MPEG-7 approved region shape descriptor, the angular radial transformation descriptor (ARTD), and the widely used Zernike moment descriptor (ZMD). Set B of the MPEG-7 CE-1 contour database and all the datasets of the MPEG-7 CE-2 region database were used for experimental validation. The average normalized modified retrieval rate (ANMRR) and the precision-recall pair were employed for benchmarking the performance of the candidate descriptors. The ILMD has lower ANMRR values than ARTD for most of the datasets, and ARTD has a lower value compared to ZMD. This indicates that the overall performance of the ILMD is better than that of ARTD and ZMD. This result is confirmed by the precision-recall test, where ILMD was found to have better precision rates for most of the datasets tested. Besides retrieval accuracy, ILMD is more compact than ARTD and ZMD. The proposed descriptor is useful as a generic shape descriptor for content-based image retrieval (CBIR) applications.
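
For readers unfamiliar with the underlying quantity, the sketch below computes plain two-dimensional Legendre moments of an image mapped onto the square [-1, 1] x [-1, 1], using the standard normalization (2p+1)(2q+1)/4 and a midpoint approximation of the integral. It does not include the invariance preprocessing that distinguishes the ILMD; the function names and the sampling scheme are assumptions of this example.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) via Bonnet's recurrence:
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def legendre_moment(image, p, q):
    """Approximate the Legendre moment of order (p, q) of a 2D image.

    Pixel centers are mapped to [-1, 1] in both directions and the double
    integral is replaced by a sum over pixels (midpoint rule).
    """
    rows, cols = len(image), len(image[0])
    dx, dy = 2.0 / cols, 2.0 / rows
    total = 0.0
    for r in range(rows):
        y = -1.0 + (2 * r + 1) / rows
        for c in range(cols):
            x = -1.0 + (2 * c + 1) / cols
            total += legendre(p, x) * legendre(q, y) * image[r][c]
    return (2 * p + 1) * (2 * q + 1) / 4.0 * total * dx * dy
```

A descriptor is then typically a vector of such moments up to some maximum order; how many orders are kept determines the compactness that the abstract compares against ARTD and ZMD.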

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This text will help health institutions organize and prepare the information needed to embark on the long and tortuous path of determining the cost/benefit ratio and of accreditation. It may also be very useful for students in undergraduate and graduate biomedical engineering programs who want to specialize in technology management for biomedical equipment and in clinical engineering. It can also be used as a reference guide by people directly connected to the health sector in maintenance, clinical engineering, or hospital services departments.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The implementation of European Directive 91/271/EEC concerning urban wastewater treatment promoted the construction of new facilities along with the introduction of new technologies for treating nutrients in areas designated as sensitive. Both the design of these new infrastructures and the redesign of existing ones were carried out using approaches based fundamentally on economic objectives, owing to the need to finish the works in a relatively short period of time. These studies were based on heuristic knowledge or on numerical correlations derived from simplified deterministic models. As a result, many of the resulting wastewater treatment plants (WWTPs) were characterized by a lack of robustness and flexibility, poor controllability, frequent microbiological solids-separation problems in the secondary settler, high operating costs, and partial nutrient removal, all of which kept them far from optimal operation. Many of these problems arose from inadequate design, so the scientific community came to recognize the importance of the early stages of conceptual design. Precisely for this reason, traditional design methods must evolve toward more complex evaluation systems that take multiple objectives into account, thereby ensuring better plant operation. Despite the importance of conceptual design that accounts for multiple objectives, there is still a significant gap in the scientific literature on this field of research. The objective of this thesis is to develop a conceptual design method for WWTPs that considers multiple objectives, so that it serves as a decision support tool for selecting the best alternative among different design options. 
This research contributes a modular and evolutionary design method that combines different techniques such as: the hierarchical decision process, multicriteria analysis, preliminary multiobjective optimization based on sensitivity analysis, knowledge extraction and data mining techniques, multivariate analysis, and uncertainty analysis based on Monte Carlo simulations. This has been achieved by subdividing the design method developed in this thesis into four main blocks: (1) hierarchical generation and multicriteria analysis of alternatives, (2) analysis of critical decisions, (3) multivariate analysis, and (4) uncertainty analysis. The first block combines a hierarchical decision process with multicriteria analysis. The hierarchical decision process subdivides the conceptual design into a series of issues that are easier to analyze and evaluate, while multicriteria analysis allows different objectives to be considered at the same time. This reduces the number of alternatives to evaluate and ensures that the future design and operation of the plant are influenced by environmental, economic, technical, and legal aspects. Finally, this block includes a sensitivity analysis of the weights, which provides information on how the different alternatives vary as the relative importance of the design objectives changes. The second block comprises sensitivity analysis, preliminary multiobjective optimization, and knowledge extraction techniques to support the conceptual design of WWTPs, selecting the best alternative once critical decisions have been identified. Critical decisions are those in which a choice must be made between alternatives that satisfy the design objectives to a similar degree but have different implications for the future structure and operation of the plant. 
This type of analysis provides a broader view of the design space and makes it possible to identify desirable (or undesirable) directions in which the design process may move. The third block of the thesis provides the multivariate analysis of the multicriteria matrices obtained during the evaluation of the design alternatives. Specifically, the techniques used in this research comprise: 1) cluster analysis, 2) principal component analysis/factor analysis, and 3) discriminant analysis. As a result, the data become more accessible for selecting among the alternatives, more information is available for a more effective evaluation, and knowledge of the evaluation process for the generated design alternatives increases. In the fourth and last block developed in this thesis, the different design alternatives are evaluated under uncertainty. The objective of this block is to study how decision making changes when an alternative is evaluated with or without uncertainty in the parameters of the models that describe its behavior. Uncertainty in the model parameters is introduced by means of probability functions. Monte Carlo simulations are then carried out, in which random numbers are drawn from these distributions and substituted for the model parameters, making it possible to study how uncertainty propagates through the model. It is thus possible to analyze the variation in the overall fulfillment of the design objectives for each alternative, the contributions of environmental, legal, economic, and technical aspects to this variation, and the change in the selection of alternatives when the relative importance of the design objectives varies. Compared with traditional design approaches, the method developed in this thesis addresses design/redesign problems while taking multiple objectives and multiple criteria into account. 
At the same time, the decision-making process shows objectively, transparently, and systematically why one alternative is selected over the others, providing the option that best fulfills the stated objectives, showing the strengths and weaknesses and the main correlations between objectives and alternatives, and taking into account the possible uncertainty inherent in the model parameters used during the analyses. The capabilities of the developed method are demonstrated in this thesis through different case studies: selection of the type of biological nitrogen removal (case study #1), optimization of a control strategy (case study #2), redesign of a plant to achieve simultaneous removal of carbon, nitrogen, and phosphorus (case study #3), and analysis of plant-wide control strategies (case studies #4 and #5).
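
The fourth block, evaluating alternatives under uncertainty, can be illustrated with a deliberately simplified Monte Carlo sketch. It assumes that each alternative is described by normalized criterion scores with Gaussian uncertainty and that the overall score is a weighted sum; in the thesis the uncertainty enters through the parameters of process models, which this example does not reproduce. All names and numbers are hypothetical.

```python
import random


def weighted_score(criteria, weights):
    """Weighted-sum aggregation of normalized criterion scores."""
    return sum(w * c for w, c in zip(weights, criteria))


def monte_carlo_ranking(alternatives, weights, n_runs=1000, seed=0):
    """Estimate how often each alternative obtains the best overall score.

    alternatives: {name: [(mean, std), ...]} with one (mean, std) pair per
                  criterion (Gaussian uncertainty is an assumption here).
    Returns {name: fraction of runs in which the alternative ranked first}.
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in alternatives}
    for _ in range(n_runs):
        scores = {
            name: weighted_score(
                [rng.gauss(m, s) for m, s in criteria], weights
            )
            for name, criteria in alternatives.items()
        }
        wins[max(scores, key=scores.get)] += 1
    return {name: count / n_runs for name, count in wins.items()}
```

Running the same alternatives with different `weights` shows how the preferred option can change when the relative importance of the design objectives varies, and widening the standard deviations shows how uncertainty can blur a ranking that looks clear-cut in the deterministic case.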