22 results for high dimensional data, call detail records (CDR), wireless telecommunication industry

in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Analysing the effect of genes and environmental factors on the development of complex diseases is a major statistical and computational challenge. Among the data mining methodologies proposed for the analysis of interactions, one of the most popular is the Multifactor Dimensionality Reduction (MDR) method (Ritchie et al. 2001). The strategy of this method is to reduce the multifactor dimension to one by pooling the different genotypes into two risk groups: high and low. Despite its proven usefulness, the MDR method has some drawbacks: excessive pooling of genotypes may cause some important interactions to go undetected, and the method does not allow adjustment for main effects or for confounding variables. In this article we illustrate the limitations of the MDR strategy and of other non-parametric approaches, and we demonstrate the advantage of using parametric methodologies to analyse interactions in case-control studies where adjustment for confounding variables and main effects is required. We propose a new methodology, a parametric version of the MDR method, which we call Model-Based Multifactor Dimensionality Reduction (MB-MDR). The proposed methodology aims to identify specific genotypes that are associated with the disease and allows adjustment for marginal effects and confounding variables. The new methodology is illustrated with data from the Spanish Bladder Cancer Study.
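The pooling step described in this abstract can be illustrated with a minimal sketch. It covers only the classical MDR idea of labelling each multi-locus genotype as high or low risk from its case/control ratio; it is not the MB-MDR method proposed by the authors. The function name, the 0/1/2 genotype coding and the default threshold (the overall case/control ratio) are assumptions made for illustration.

```python
from collections import Counter

def mdr_risk_groups(genotypes, is_case, threshold=None):
    """Label each multi-locus genotype cell as 'high' or 'low' risk.

    genotypes: list of tuples, one per subject, e.g. (snp1, snp2) coded 0/1/2.
    is_case:   list of booleans, True for cases and False for controls.
    threshold: case/control ratio above which a cell counts as high risk;
               defaults to the overall case/control ratio (an assumption).
    """
    cases = Counter(g for g, c in zip(genotypes, is_case) if c)
    controls = Counter(g for g, c in zip(genotypes, is_case) if not c)
    if threshold is None:
        threshold = sum(cases.values()) / sum(controls.values())
    labels = {}
    for cell in set(cases) | set(controls):
        n_case, n_ctrl = cases[cell], controls[cell]
        # A cell containing cases but no controls is treated as high risk.
        ratio = float("inf") if n_ctrl == 0 else n_case / n_ctrl
        labels[cell] = "high" if ratio > threshold else "low"
    return labels

# Hypothetical toy data: three genotype cells with four subjects each.
genotypes = [(0, 0)] * 4 + [(0, 1)] * 4 + [(1, 1)] * 4
is_case = [True, True, True, False,
           True, False, False, False,
           True, True, False, False]
labels = mdr_risk_groups(genotypes, is_case)
```

Because every cell collapses to one of two labels, cells with very different case/control ratios can end up in the same group, which is the excessive pooling the abstract criticises.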

Abstract:

Background Computerised databases of primary care clinical records are widely used for epidemiological research. In Catalonia, the Information System for the Development of Research in Primary Care (SIDIAP) aims to promote the development of research based on high-quality validated data from primary care electronic medical records. Objective The purpose of this study is to create and validate a scoring system (Registry Quality Score, RQS) that will enable primary care practices (PCPs) to be selected as providers of research-usable data based on the completeness of their registers. Methods Diseases that were likely to be representative of common diagnoses seen in primary care were selected for RQS calculations. The observed/expected cases ratio was calculated for each disease. Once we had obtained an estimated value of this ratio for each of the selected conditions, we added up the ratios to obtain a final RQS. Rate comparisons between observed and published prevalences of diseases not included in the RQS calculations (atrial fibrillation, diabetes, obesity, schizophrenia, stroke, urinary incontinence and Crohn’s disease) were used to set the RQS cut-off that will enable researchers to select PCPs with research-usable data. Results Apart from Crohn’s disease, all prevalences were the same as those published from the RQS fourth quintile (60th percentile) onwards. This RQS cut-off provided a total population of 1 936 443 (39.6% of the total SIDIAP population). Conclusions SIDIAP is highly representative of the population of Catalonia in terms of geographical, age and sex distributions. We report the usefulness of rate comparison as a valid method to establish research-usable data within primary care electronic medical records.
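The score described in the Methods can be sketched in a few lines: the RQS of a practice is the sum of its observed/expected case ratios, and practices at or above the 60th percentile of the score are retained. The disease names, the counts, the practice identifiers and the nearest-rank percentile rule below are hypothetical and chosen only for illustration. The abstract does not list the diseases used or say how the percentile was computed.

```python
def registry_quality_score(observed, expected):
    """Sum of observed/expected case ratios over the selected diseases."""
    return sum(observed[disease] / expected[disease] for disease in expected)

def select_practices(scores, percentile=60):
    """Return the practices whose RQS is at or above the given percentile.

    Uses a simple nearest-rank rule on the sorted scores (an assumption).
    """
    ranked = sorted(scores.values())
    index = min(percentile * len(ranked) // 100, len(ranked) - 1)
    cutoff = ranked[index]
    return {name for name, score in scores.items() if score >= cutoff}

# Hypothetical counts for one practice and two illustrative diseases.
observed = {"asthma": 90, "hypertension": 210}
expected = {"asthma": 100, "hypertension": 200}
rqs = registry_quality_score(observed, expected)

# Hypothetical scores for five practices.
scores = {"A": 3.1, "B": 4.0, "C": 4.8, "D": 5.2, "E": 5.9}
selected = select_practices(scores)
```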

Abstract:

In this paper we introduce a highly efficient reversible data hiding system. It is based on dividing the image into tiles and shifting the histograms of each image tile between its minimum and maximum frequency. Data are then inserted at the pixel level with the largest frequency to maximize data hiding capacity. It exploits the special properties of medical images, where the histograms of their non-overlapping image tiles mostly peak around some grey values and the rest of the spectrum is mainly empty. The zeros (or minima) and peaks (maxima) of the histograms of the image tiles are then relocated to embed the data. The grey values of some pixels are therefore modified. High capacity, high fidelity, reversibility and multiple data insertions are the key requirements of data hiding in medical images. We show how histograms of image tiles of medical images can be exploited to achieve these requirements. Compared with a data hiding method applied to the whole image, our scheme can yield a 30%-200% capacity improvement while still giving better image quality, depending on the medical image content. Additional advantages of the proposed method include hiding data in regions of non-interest and better exploitation of spatial masking.
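The peak-and-zero idea behind histogram shifting can be illustrated on a single tile. The sketch below is a generic single-peak variant, written under the assumptions that the tile is an 8-bit array and that an empty histogram bin exists above the peak. It is not the authors' scheme, and the function names are hypothetical. Grey values between the peak and the empty bin are shifted up by one to free the bin next to the peak, and each peak pixel then carries one bit.

```python
import numpy as np

def embed_bits(tile, bits):
    """Embed a list of 0/1 bits into one 8-bit tile by histogram shifting."""
    hist = np.bincount(tile.ravel(), minlength=256)
    peak = int(hist.argmax())                      # most frequent grey value
    empty = np.flatnonzero(hist[peak + 1:] == 0)   # empty bins above the peak
    if empty.size == 0:
        raise ValueError("no empty bin above the peak in this tile")
    zero = peak + 1 + int(empty[0])
    flat = tile.ravel().astype(np.int32)           # working copy
    carriers = np.flatnonzero(flat == peak)        # pixels that can hold a bit
    if len(bits) > carriers.size:
        raise ValueError("payload exceeds tile capacity")
    # Shift values strictly between peak and zero up by one.
    flat[(flat > peak) & (flat < zero)] += 1
    # A carrier stays at peak for bit 0 and moves to peak + 1 for bit 1.
    flat[carriers[:len(bits)]] += np.asarray(bits, dtype=np.int32)
    return flat.reshape(tile.shape).astype(np.uint8), peak, zero

def extract_bits(marked, peak, zero, n_bits):
    """Recover the payload and restore the original tile exactly."""
    flat = marked.ravel().astype(np.int32)
    carriers = np.flatnonzero((flat == peak) | (flat == peak + 1))
    bits = (flat[carriers[:n_bits]] == peak + 1).astype(int).tolist()
    # Undo both the embedding and the shift.
    flat[(flat > peak) & (flat <= zero)] -= 1
    return bits, flat.reshape(marked.shape).astype(np.uint8)

# Hypothetical 4x4 tile whose histogram peaks at grey value 10.
tile = np.array([[10, 10, 11, 12],
                 [10, 13, 10, 9],
                 [10, 11, 10, 200],
                 [12, 10, 10, 9]], dtype=np.uint8)
payload = [1, 0, 1, 1, 0]
marked, peak, zero = embed_bits(tile, payload)
recovered, restored = extract_bits(marked, peak, zero, len(payload))
```

The capacity of a tile equals the height of its histogram peak, which is why tiles with sharply peaked histograms, as the abstract reports for medical images, can hold more data than the image treated as a whole.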

Abstract:

Vagueness and high dimensionality are common features of current data. This paper presents an approach to identifying conceptual structures among fuzzy three-dimensional data sets in order to obtain a conceptual hierarchy. We propose a fuzzy extension of Galois connections that allows us to prove an isomorphism theorem between fuzzy set closures, which is the basis for generating lattice-ordered sets.

Abstract:

The self-organizing map (Kohonen 1997) is a type of artificial neural network developed to explore patterns in high-dimensional multivariate data. The conventional version of the algorithm uses the Euclidean metric when adapting the model vectors, which in theory renders the whole methodology incompatible with non-Euclidean geometries.

In this contribution we explore the two main aspects of the problem:

1. Whether the conventional approach using the Euclidean metric can yield valid results with compositional data.
2. Whether a modification of the conventional approach, replacing vector sum and scalar multiplication by the canonical operators in the simplex (i.e. perturbation and powering), can converge to an adequate solution.

Preliminary tests showed that both methodologies can be used on compositional data. However, the modified version of the algorithm performs worse than the conventional version, in particular when the data are pathological. Moreover, the conventional approach converges faster to a solution when the data are "well-behaved".

Key words: Self Organizing Map; Artificial Neural networks; Compositional data
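The two update rules compared in this abstract can be written side by side. The sketch below assumes strictly positive compositions and uses the standard simplex operations: perturbation is a closed component-wise product and powering is a closed component-wise power. The function names and the learning rate `alpha` are illustrative, not taken from the paper.

```python
import numpy as np

def closure(x):
    """Rescale a positive vector so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure(np.asarray(x, dtype=float) * np.asarray(y, dtype=float))

def power(x, a):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure(np.asarray(x, dtype=float) ** a)

def som_update_euclidean(m, x, alpha):
    """Conventional update: m + alpha * (x - m)."""
    m, x = np.asarray(m, dtype=float), np.asarray(x, dtype=float)
    return m + alpha * (x - m)

def som_update_simplex(m, x, alpha):
    """Modified update: m perturbed by the powered difference of x and m."""
    difference = perturb(x, 1.0 / np.asarray(m, dtype=float))  # x "minus" m
    return perturb(m, power(difference, alpha))

# Hypothetical three-part compositions.
m = np.array([0.2, 0.3, 0.5])
x = np.array([0.6, 0.3, 0.1])
moved = som_update_simplex(m, x, alpha=0.5)
```

With `alpha=1` the simplex update lands exactly on the sample and with `alpha=0` it leaves the model vector unchanged, mirroring the behaviour of the Euclidean rule.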

Abstract:

In this paper we study how the access price affects the choice of tariff regime made by network operators. We show that for high values of the access price, which the firms take as a parameter, networks decide to charge only the callers. For low values of the access charge, in contrast, networks also charge the receivers. Moreover, we compare market penetration and total welfare between the two price regimes. Our model suggests that, for high values of the call externality, market penetration and total welfare are larger under the Receiving Party Pays regime when the access charge is close to zero.

Abstract:

Graphical displays which show inter-sample distances are important for the interpretation and presentation of multivariate data. Except when the displays are two-dimensional, however, they are often difficult to visualize as a whole. A device, based on multidimensional unfolding, is described for presenting some intrinsically high-dimensional displays in fewer, usually two, dimensions. This goal is achieved by representing each sample by a pair of points, say $R_i$ and $r_i$, so that a theoretical distance between the $i$-th and $j$-th samples is represented twice, once by the distance between $R_i$ and $r_j$ and once by the distance between $R_j$ and $r_i$. Self-distances between $R_i$ and $r_i$ need not be zero. The mathematical conditions for unfolding to exhibit symmetry are established. Algorithms for finding approximate fits, not constrained to be symmetric, are discussed and some examples are given.
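The double representation of each distance can be made concrete with a small sketch. It assumes Euclidean distances in the display and a sum-of-squares misfit. Both the function names and the misfit criterion are assumptions for illustration, since the abstract does not state which fitting criterion the algorithms use.

```python
import numpy as np

def cross_distances(R, r):
    """Matrix whose (i, j) entry is the distance between R_i and r_j."""
    R, r = np.asarray(R, dtype=float), np.asarray(r, dtype=float)
    return np.linalg.norm(R[:, None, :] - r[None, :, :], axis=2)

def unfolding_misfit(D, R, r):
    """Squared misfit over all ordered pairs (i, j) with i != j.

    Each target distance D[i, j] is compared twice: with |R_i - r_j| at
    entry (i, j) and with |R_j - r_i| at entry (j, i). The diagonal is
    skipped because self-distances need not be zero.
    """
    D = np.asarray(D, dtype=float)
    C = cross_distances(R, r)
    off_diagonal = ~np.eye(len(D), dtype=bool)
    return float(((D - C)[off_diagonal] ** 2).sum())

# Hypothetical two-sample configuration in the plane.
R = [[0.0, 0.0], [1.0, 0.0]]
r = [[0.0, 1.0], [1.0, 1.0]]
D = [[0.0, np.sqrt(2.0)], [np.sqrt(2.0), 0.0]]
C = cross_distances(R, r)
misfit = unfolding_misfit(D, R, r)
```

In this toy configuration both representations of the single inter-sample distance are exact, while the self-distances on the diagonal of `C` equal 1 rather than 0.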

Abstract:

Objective: This study aims to provide knowledge about the care model received by dependent people over 79 years of age in the municipality of Vic, analysing to what extent formal services are used and which variables influence their use. Method: Retrospective, descriptive, cross-sectional study using quantitative methodology. The study population comprises people aged 80 and over in Vic who applied for a dependency assessment between 2007 and 2010, with a recognised dependency grade II or III and an Individual Care Plan validated and granted by the Generalitat de Catalunya (n=453). The data come from records of the Generalitat de Catalunya and of the Àrea d’Afers Socials i Ciutadania of the Ajuntament de Vic. The dependent variables are the use of formal resources: telecare (TAS), home care service (SAD, public and private), day centre, residential care and the financial benefits derived from the dependency law. Dependency grade, gender, age, marital status, living arrangements, main caregiver and income level were considered independent variables. Results: The predominant care model is one that complements informal support with formal support (62.3%). The use of formal resources plays a subsidiary role (37.7%). The living-arrangements variable significantly influences the use of formal services (p<0.001 for the use of TAS, public SAD and private SAD). Conclusion: The design of programmes and of service provision criteria should consider not only the degree of dependency but also contextual variables such as living arrangements. However, there is still little scientific evidence along these lines, so research enabling a more accurate analysis of social function variables should be promoted. Keywords: Dependency, formal support, informal support.

Abstract:

In the last few years a new type of instrument, the Terrestrial Laser Scanner (TLS), has entered the commercial market. These devices make it possible to obtain a completely new type of spatial, three-dimensional data describing the object of interest. TLS instruments generate a type of data that needs special treatment. The appearance of this technique has made it possible to monitor deformations of very large objects, such as the landslides investigated here, at a new level of quality. This change is especially visible in the size and number of the details that can be observed with this new method. In this context, the work presented here is oriented towards the recognition and characterization of raw data received from TLS instruments, as well as the processing phases, tools and techniques applied to them. The main objectives are the definition and recognition of the problems related to the use of TLS data, characterization of the quality of a single point generated by TLS, description and investigation of a TLS processing approach for landslide deformation measurements that yields 3D deformation characteristics, and finally validation of the results obtained. These objectives are pursued through bibliographic study and research work, followed by several experiments intended to support the conclusions.

Abstract:

This final-year project presents a system capable of managing and storing patients' medical histories. The system will allow read and modify operations on medical records to be performed securely and reliably, bearing in mind that access to the information takes place over a communication network.

Abstract:

This final-year project (Trabajo Final de Carrera, TFC) focuses on the management of a project to implement a repository of digital learning objects at a university, and falls within the Project Management area of the Ingeniería Técnica Informática de Gestión degree.

Abstract:

Next Generation Access Networks (NGAN) are the next step forward in delivering broadband services and facilitating the integration of different technologies. It is plausible to assume that, from a technological standpoint, the Future Internet will be composed of long-range high-speed optical networks; a number of wireless networks at the edge; and, in between, several access technologies, among which Passive Optical Networks (xPON) are very likely to succeed, due to their simplicity, low cost and increased bandwidth. Among the different PON technologies, Ethernet PON (EPON) is the most promising alternative to satisfy operator and user needs, due to its cost, flexibility and interoperability with other technologies. One of the most interesting challenges in such technologies relates to the scheduling and allocation of resources in the upstream (shared) channel. The aim of this research project is to study and evaluate current contributions and propose new efficient solutions to address the resource allocation issues in Next Generation EPON (NG-EPON). Key issues in this context are future end-user needs, integrated quality of service (QoS) support and optimized service provisioning for real-time and elastic flows. This project will unveil research opportunities, issue recommendations and propose novel mechanisms associated with convergence within heterogeneous access networks, and will thus serve as a basis for long-term research projects in this direction. The project has served as a platform for the generation of new concepts and solutions that were published in national and international conferences, scientific journals and a book chapter. We expect further research publications, in addition to those mentioned, to be generated in the coming months.

Abstract:

The paper proposes a numerical solution method for general equilibrium models with a continuum of heterogeneous agents, which combines elements of projection and of perturbation methods. The basic idea is to solve first for the stationary solution of the model, without aggregate shocks but with fully specified idiosyncratic shocks. Afterwards one computes a first-order perturbation of the solution in the aggregate shocks. This approach allows a high-dimensional representation of the cross-sectional distribution to be included in the state vector. The method is applied to a model of household saving with uninsurable income risk and liquidity constraints. The model includes not only productivity shocks, but also shocks to redistributive taxation, which cause substantial short-run variation in the cross-sectional distribution of wealth. If those shocks are operative, it is shown that a solution method based on very few statistics of the distribution is not suitable, while the proposed method can solve the model with high accuracy, at least for the case of small aggregate shocks. Techniques are discussed to reduce the dimension of the state space such that higher-order perturbations are feasible. Matlab programs to solve the model can be downloaded.

Abstract:

This work proposes novel network analysis techniques for multivariate time series. We define the network of a multivariate time series as a graph where vertices denote the components of the process and edges denote non-zero long-run partial correlations. We then introduce a two-step LASSO procedure, called NETS, to estimate high-dimensional sparse Long Run Partial Correlation networks. This approach is based on a VAR approximation of the process and allows the long-run linkages to be decomposed into the contributions of the dynamic and contemporaneous dependence relations of the system. The large-sample properties of the estimator are analysed and we establish conditions for consistent selection and estimation of the non-zero long-run partial correlations. The methodology is illustrated with an application to a panel of U.S. blue chips.
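The network definition in this abstract can be illustrated with a minimal sketch that turns a concentration matrix into partial correlations and an edge list. It assumes that a long-run concentration matrix `K` (the inverse of the long-run covariance matrix) is already available as input. The two-step LASSO estimation that NETS performs is not shown, and the function names and tolerance are hypothetical.

```python
import numpy as np

def partial_correlations(K):
    """Partial correlations from a concentration matrix K.

    Uses the standard identity rho_ij = -k_ij / sqrt(k_ii * k_jj).
    """
    K = np.asarray(K, dtype=float)
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

def network_edges(K, tol=1e-8):
    """Edges (i, j), i < j, whose partial correlation is non-zero."""
    P = partial_correlations(K)
    n = P.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(P[i, j]) > tol]

# Hypothetical sparse concentration matrix for three series.
K = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
P = partial_correlations(K)
edges = network_edges(K)
```

In this toy example the first and third series are linked only through the second, so the graph has two edges and no direct edge between vertices 0 and 2.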