877 results for data-mining application
Abstract:
This final degree project will attempt to implement a secure client-server scenario that uses an external identity provider to perform the validations of a client's identity and associated permissions when accessing a resource hosted on a service provider.
Abstract:
A data-mining study of the causes of dropout among students of a degree programme at the UOC
Abstract:
In the past, sensor networks in cities have been limited to fixed sensors, embedded in particular locations, under centralised control. Today, new applications can leverage wireless devices and use them as sensors to create aggregated information. In this paper, we show that the emerging patterns unveiled through the analysis of large sets of aggregated digital footprints can provide novel insights into how people experience the city and into some of the drivers behind these emerging patterns. We particularly explore the capacity to quantify the evolution of the attractiveness of urban space with a case study in the area of the New York City Waterfalls, a public art project of four man-made waterfalls rising from the New York Harbor. Methods to study the impact of an event of this nature are traditionally based on the collection of static information such as surveys and ticket-based people counts, which make it possible to generate estimates of visitors’ presence in specific areas over time. In contrast, our contribution makes use of the dynamic data that visitors generate, such as the density and distribution of aggregate phone calls and photos taken in different areas of interest and over time. Our analysis provides novel ways to quantify the impact of a public event on the distribution of visitors and on the evolution of the attractiveness of the points of interest in proximity. This information has potential uses for local authorities and researchers, as well as for service providers such as mobile network operators.
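The sketch below shows, in minimal form, how such aggregated footprints can be turned into a per-area attractiveness index: daily event counts per area of interest are normalised against a pre-event baseline. The table layout, column names and baseline rule are assumptions for illustration, not the authors' pipeline.

    # Minimal sketch: per-area daily activity normalised against a baseline.
    # Column names ("area", "timestamp") and the baseline rule are assumed.
    import pandas as pd

    def attractiveness_index(df: pd.DataFrame, baseline_days: int = 7) -> pd.DataFrame:
        """Relative daily activity per area versus its pre-event baseline."""
        daily = (df.groupby(["area", pd.Grouper(key="timestamp", freq="D")])
                   .size()
                   .rename("events")
                   .reset_index())
        # Baseline: mean activity over each area's first `baseline_days` days.
        baseline = (daily.sort_values("timestamp")
                         .groupby("area")["events"]
                         .apply(lambda s: s.head(baseline_days).mean())
                         .rename("baseline"))
        daily = daily.join(baseline, on="area")
        daily["attractiveness"] = daily["events"] / daily["baseline"]
        return daily

    # Rows stand in for individual anonymised events (calls, photos).
    records = pd.DataFrame({
        "area": ["waterfall_site", "waterfall_site", "control_area"],
        "timestamp": pd.to_datetime(["2008-06-27", "2008-06-28", "2008-06-27"]),
    })
    print(attractiveness_index(records, baseline_days=1))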
Abstract:
For the last decade, high-resolution (HR)-MS has been associated with qualitative analyses, while triple quadrupole MS has been associated with routine quantitative analyses. However, a shift of this paradigm is taking place: quantitative and qualitative analyses will be increasingly performed by HR-MS, and it will become the common 'language' for most mass spectrometrists. Most analyses will be performed by full-scan acquisitions recording 'all' ions entering the HR-MS, with subsequent construction of narrow-width extracted-ion chromatograms. Ions will be available for absolute quantification, profiling and data mining. In parallel to quantification, metabotyping will be the next step in clinical LC-MS analyses because it should help in personalized medicine. This article aims to help analytical chemists who perform targeted quantitative acquisitions with triple quadrupole MS make the transition to quantitative and qualitative analyses using HR-MS. Guidelines for the acceptance criteria of mass accuracy and for the determination of mass extraction windows in quantitative analyses are proposed.
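As a concrete illustration of the post-acquisition step described above, the following minimal sketch (not taken from the article) builds a narrow-width extracted-ion chromatogram from full-scan centroid data, with the mass extraction window expressed in ppm; the tuple-based data layout is an assumption.

    # Minimal sketch: extracted-ion chromatogram (XIC) from full-scan centroids.
    # `scans` as (retention_time, mz, intensity) tuples is an assumed layout.
    from collections import defaultdict

    def xic(scans, target_mz, window_ppm=5.0):
        """Sum intensities of centroids within +/- window_ppm of target_mz."""
        half_width = target_mz * window_ppm / 1e6  # ppm -> absolute m/z units
        trace = defaultdict(float)
        for rt, mz, intensity in scans:
            if abs(mz - target_mz) <= half_width:
                trace[rt] += intensity
        return sorted(trace.items())  # [(retention time, summed intensity), ...]

    # A 5 ppm window around m/z 285.0768 spans roughly +/- 0.0014.
    demo = [(1.00, 285.0769, 1.2e5), (1.00, 285.2000, 9.0e4), (1.05, 285.0766, 1.4e5)]
    print(xic(demo, target_mz=285.0768, window_ppm=5.0))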
Abstract:
In 1993, Iowa Workforce Development (then the Department of Employment Services) conducted a survey to determine if there was a gender gap in wages paid. The results of that survey indicated that women were paid 68 cents per dollar paid to males. We felt a need to determine if this relationship of wages paid to each gender has changed since the 1993 study. In 1999, the Commission on the Status of Women requested that Iowa Workforce Development conduct research to update the 1993 information. A survey, cosponsored by the Commission on the Status of Women and Iowa Workforce Development, was conducted in 1999. The results of the survey showed that women earned 73 percent of what men earned when both jobs were considered. (The survey asked respondents to provide information on a primary job and a secondary job.) The ratio for the primary job was 72 percent, while the ratio for the secondary job was 85 percent. Additional survey results detail the types of jobs respondents had, the types of companies for which they worked and the education and experience levels. All of these characteristics can contribute to these ratios. While the large influx of women into the labor force may be over, it is still important to look at such information to determine if future action is needed. We present these results with that goal in mind. We are indebted to those Iowans, female and male, who voluntarily completed the survey. This study was completed under the general direction of Judy Erickson. The report was written by Shazada Khan, Teresa Wageman, Ann Wagner, and Yvonne Younes with administrative and technical assistance from Michael Blank, Margaret Lee and Gary Wilson. The Iowa State University Statistical Lab provided sampling advice, data entry and coding and data analysis.
Abstract:
The main objective of this project is to study several ticketing tools and analyse their characteristics, in order to make an informed choice of the most suitable one and then carry out the modifications needed to adapt it to the field of ARCO rights.
Abstract:
The Department of Elder Affairs maintains and provides population and demographic estimates/projections for age 60+ for the state and for its counties and incorporated places. DEA also provides population estimates on poverty, race and ethnicity, and urban and rural for age 60+. This statistical information is obtained from numerous resources, including the State Data Center of Iowa, US Census Bureau, the Administration on Aging, and Iowa State University Census Services. "The Census Bureau uses the latest available estimates as starting points for population projections. Sometimes the user may see both an estimate and a projection available for the same reference date, which may not agree because they were produced at different times. In such cases, estimates are the preferred data." (Source: State Data Center)
Abstract:
Metabolite profiling is critical in many aspects of the life sciences, particularly natural product research. Obtaining precise information on the chemical composition of complex natural extracts (metabolomes), which are primarily obtained from plants or microorganisms, is a challenging task that requires sophisticated, advanced analytical methods. In this respect, significant advances in hyphenated chromatographic techniques (LC-MS, GC-MS and LC-NMR in particular), as well as in data mining and processing methods, have occurred over the last decade. Together, these tools, in combination with bioassay profiling methods, serve an important role in metabolomics for the purposes of both peak annotation and dereplication in natural product research. In this review, a survey of the techniques that are used for generic and comprehensive profiling of secondary metabolites in natural extracts is provided. The various approaches (chromatographic methods: LC-MS, GC-MS and LC-NMR; direct spectroscopic methods: NMR and DIMS) are discussed with respect to their resolution and sensitivity for extract profiling. In addition, the structural information that these techniques can generate, alone or in combination, is compared in relation to the identification of metabolites in complex mixtures. Analytical strategies with applications to natural extracts, as well as novel methods with strong potential regardless of how often they are currently used, are discussed with respect to their potential applications and future trends.
Abstract:
With extremely diverse morphological and soil-climate characteristics, the island of Santo Antão in Cabo Verde shows a well-recognised environmental vulnerability together with a marked shortage of scientific studies that address this reality and provide the basis for an integrated understanding of its phenomena. Digital cartography and geographic information technologies have brought technological advances in the collection, storage and processing of spatial data. Several tools currently available make it possible to model a multiplicity of factors, to locate and quantify phenomena, and to define the contribution of different factors to the final result. This study, carried out within the postgraduate and master's programme in Geographic Information Systems of the Universidade de Trás-os-Montes e Alto Douro, aims to help reduce the information deficit concerning the biophysical characteristics of the island by applying geographic information and remote sensing technologies combined with multivariate statistical analysis. In this context, thematic maps were produced and analysed and a model for integrated data analysis was developed. The multiplicity of spatial variables produced, among them 29 continuously varying variables liable to influence the biophysical characteristics of the region, together with the possible occurrence of mutually antagonistic or synergistic effects, makes interpretation from the original data relatively complex. To overcome this problem, a systematic sampling network totalling 921 points (repetitions) was used to extract the values of the 29 variables at the sampling points, followed by multivariate statistical analysis, namely principal component analysis. These techniques made it possible to simplify and interpret the original variables, standardising them and summarising the information contained in the set of mutually correlated original variables in a set of orthogonal (uncorrelated) variables of decreasing importance, the principal components. A target was set of concentrating 75% of the variance of the original data in the first three principal components, and an iterative multi-stage process was developed that successively eliminated the least representative variables. In the final stage, the first three PCs explained 74.54% of the variance of the original data, but in the subsequent phase they proved insufficient to portray reality. The fourth PC (PC4) was therefore included, bringing the explained variance to 84% and representing eight biophysical variables: altitude, drainage density, geological fracture density, precipitation, vegetation index, temperature, water resources and distance to the hydrographic network. The subsequent interpolation of the first principal component (PC1), with the main variables associated with PC2, PC3 and PC4 as auxiliary variables, using geostatistical techniques in an ArcGIS environment, yielded a map representing 84% of the variation of the biophysical characteristics across the territory. Cluster analysis, validated by Student's t-test, made it possible to reclassify the territory into six homogeneous biophysical units.
It is concluded that the geographic information technologies currently available, in addition to facilitating interactive and flexible analyses in which themes and criteria can be varied, new information integrated, and improvements introduced into models built on the information available in a given context, make it possible, when combined with multivariate statistical analysis and on the basis of scientific criteria, to carry out an integrated analysis of multiple biophysical variables whose mutual correlation makes an integrated understanding of the phenomena complex.
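The following minimal sketch reproduces the dimensionality-reduction step described above under stated assumptions: 29 standardised variables sampled at 921 points, principal component analysis, and a 75% explained-variance target. The random matrix merely stands in for the real sampled data.

    # Minimal sketch: PCA on 921 samples x 29 standardised variables,
    # keeping the leading components needed to reach 75% explained variance.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    X = rng.normal(size=(921, 29))             # placeholder for the sampled variables

    X_std = StandardScaler().fit_transform(X)  # normalise, as in the study
    pca = PCA().fit(X_std)

    cum = np.cumsum(pca.explained_variance_ratio_)
    n_keep = int(np.searchsorted(cum, 0.75) + 1)   # first k PCs reaching 75%
    print(f"{n_keep} components explain {cum[n_keep - 1]:.1%} of the variance")
    scores = pca.transform(X_std)[:, :n_keep]      # orthogonal component scores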
Abstract:
The worldwide situation regarding the growing number of losses and fraud occurrences in electricity distribution networks is worrying. In Cabo Verde, and more precisely in the city of Praia, the situation is even more worrying given the number of occurrences and their severity. We propose a research project on electricity losses and fraud based on the analysis of customer records in the Electra (Cabo Verde) database, with the aim of guiding strategic management decisions concerning policies for the control and prevention of electricity losses and fraud. The work is based on collecting and selecting data, organising it in a data warehouse, and then applying OLAP technologies to identify losses at the transformer stations and geographical zones of the city of Praia in Cabo Verde, and subsequently identifying possible electricity fraud among end customers using data mining. The main results consisted of the identification of electricity-loss situations at transformer stations, the identification of critical areas selected for inspection of their end customers, and the detection of anomaly patterns associated with customer profiles.
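A minimal sketch of the transformer-station loss calculation described above, with hypothetical table and column names: the energy billed to a station's customers is compared with the energy injected into the station, and stations exceeding an assumed loss threshold are flagged as critical areas for inspection.

    # Minimal sketch: loss rate per transformer station.
    # Table and column names, and the 15% threshold, are assumptions.
    import pandas as pd

    injected = pd.DataFrame({            # metering at the transformer stations
        "station": ["PT01", "PT02"],
        "kwh_in":  [120_000, 80_000],
    })
    billed = pd.DataFrame({              # customer billing records
        "station": ["PT01", "PT01", "PT02"],
        "kwh_billed": [70_000, 20_000, 76_000],
    })

    losses = (billed.groupby("station", as_index=False)["kwh_billed"].sum()
                    .merge(injected, on="station"))
    losses["loss_rate"] = 1 - losses["kwh_billed"] / losses["kwh_in"]

    THRESHOLD = 0.15                     # assumed inspection threshold
    print(losses[losses["loss_rate"] > THRESHOLD])   # PT01: 25% -> inspect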
Abstract:
The main objective of this undergraduate degree project is to analyse how Business Intelligence (BI) helps organisations continuously improve their performance and service quality, above all in the decision-making process, and to study its presence at Cabo Verde Telecom. The technologies associated with BI, namely data warehousing, data mining and OLAP, are essential for decision-making on strategic activities in the business market. These technologies allow a careful analysis of the data, transforming it into information relevant to corporate decision-making and thereby securing the company's growth in the market.
Abstract:
Many classifiers achieve high levels of accuracy but have limited applicability in real-world situations because they do not lead to a greater understanding of, or insight into, the way features influence the classification. In areas such as health informatics, a classifier that clearly identifies the influences on classification can be used to direct research and formulate interventions. This research investigates the practical applications of Automated Weighted Sum (AWSum), a classifier that provides accuracy comparable to other techniques whilst providing insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. The merits of this approach in classification and insight are evaluated on Cystic Fibrosis and Diabetes datasets, with positive results.
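The sketch below illustrates the idea as the abstract states it: a weight is computed for each feature value reflecting its influence on the class, and an instance is classified by summing the weights of its feature values. The specific weight formula used here, a difference of class-conditional frequencies, is an assumption for illustration, not the published AWSum definition.

    # Minimal sketch of a weighted-sum classifier over feature values.
    # The weight formula is assumed, not taken from the AWSum paper.
    from collections import Counter, defaultdict

    def train(instances, labels):
        """instances: list of {feature: value} dicts; labels: 0/1."""
        counts = defaultdict(Counter)        # (feature, value) -> class counts
        for inst, y in zip(instances, labels):
            for feat, val in inst.items():
                counts[(feat, val)][y] += 1
        # Weight in [-1, 1]: positive values push towards class 1.
        return {key: (c[1] - c[0]) / (c[0] + c[1]) for key, c in counts.items()}

    def classify(weights, inst):
        score = sum(weights.get((f, v), 0.0) for f, v in inst.items())
        return 1 if score > 0 else 0

    data = [{"smoker": "yes"}, {"smoker": "no"}, {"smoker": "yes"}]
    w = train(data, [1, 0, 1])
    print(classify(w, {"smoker": "yes"}))    # -> 1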
Abstract:
The aim of this project is to perform opinion mining on the microblogging social network Twitter. First, we will carry out a sentiment-classification task using a simple lexicon. Next, we will apply the association-rules technique and, finally, we will perform clustering tasks.
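A minimal sketch of the first step, lexicon-based sentiment classification, is given below; the tiny lexicon and the example tweets are invented for illustration.

    # Minimal sketch: score a tweet against a sentiment lexicon.
    # The lexicon entries are invented; a real run would use a full lexicon.
    LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}

    def classify_tweet(text: str) -> str:
        score = sum(LEXICON.get(token, 0) for token in text.lower().split())
        if score > 0:
            return "positive"
        if score < 0:
            return "negative"
        return "neutral"

    print(classify_tweet("I love this great phone"))       # positive
    print(classify_tweet("awful battery and bad screen"))  # negative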
Abstract:
Purpose: To describe a novel in silico method to gather and analyze data from high-throughput heterogeneous experimental procedures, i.e. gene and protein expression arrays. Methods: Each microarray is assigned to a database which handles common data (names, symbols, antibody codes, probe IDs, etc.). Links between pieces of information are automatically generated from knowledge obtained in freely accessible databases (NCBI, Swissprot, etc.). Requests can be made from any point of entry, and the displayed result is fully customizable. Results: The initial database was loaded with two sets of data: a first set originating from an Affymetrix-based retinal profiling performed in an RPE65 knock-out mouse model of Leber's congenital amaurosis, and a second set generated from a Kinexus microarray experiment done on retinas from the same mouse model. Queries display wild-type versus knock-out expression at several time points for both genes and proteins. Conclusions: This freely accessible database allows for easy consultation of data and facilitates data mining by integrating experimental data and biological pathways.
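A minimal sketch, with an invented schema, of the kind of cross-platform query such a database enables: gene-level (Affymetrix) and protein-level (Kinexus) records are linked through a shared identifier and compared wild type versus knock-out at each time point.

    # Minimal sketch: join gene- and protein-expression records on a shared
    # identifier. Schema, symbols and values are invented for illustration.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.executescript("""
    CREATE TABLE gene_expr    (symbol TEXT, timepoint TEXT, wt REAL, ko REAL);
    CREATE TABLE protein_expr (symbol TEXT, timepoint TEXT, wt REAL, ko REAL);
    INSERT INTO gene_expr    VALUES ('Rho', 'P14', 10.2, 3.1);
    INSERT INTO protein_expr VALUES ('Rho', 'P14',  8.7, 2.9);
    """)
    rows = con.execute("""
        SELECT g.symbol, g.timepoint, g.wt, g.ko, p.wt, p.ko
        FROM gene_expr g
        JOIN protein_expr p ON g.symbol = p.symbol AND g.timepoint = p.timepoint
    """).fetchall()
    print(rows)   # one row linking mRNA and protein measurements for Rho at P14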