965 results for STATISTICAL DATA INTERPRETATION


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are becoming more interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data very complex to analyse manually, which makes computational means of analysis pertinent. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of social networks over the decades, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, including the tools employed as well as the names of their authors.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

"June 1995."

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the last thirty years, the emergence and progression of biologging technology has led to great advances in marine predator ecology. Large databases of location and dive observations from biologging devices have been compiled for an increasing number of diving predator species (such as pinnipeds, sea turtles, seabirds and cetaceans), enabling complex questions about animal activity budgets and habitat use to be addressed. Central to answering these questions is our ability to correctly identify and quantify the frequency of essential behaviours, such as foraging. Despite technological advances that have increased the quality and resolution of location and dive data, accurately interpreting behaviour from such data remains a challenge, and analytical methods are only beginning to unlock the full potential of existing datasets. This review evaluates both traditional and emerging methods and presents a starting platform of options for future studies of marine predator foraging ecology, particularly from location and two-dimensional (time-depth) dive data. We outline the different devices and data types available, discuss the limitations and advantages of commonly-used analytical techniques, and highlight key areas for future research. We focus our review on pinnipeds - one of the most studied taxa of marine predators - but offer insights that will be applicable to other air-breathing marine predator tracking studies. We highlight that traditionally-used methods for inferring foraging from location and dive data, such as first-passage time and dive shape analysis, have important caveats and limitations depending on the nature of the data and the research question. We suggest that more holistic statistical techniques, such as state-space models, which can synthesise multiple track, dive and environmental metrics whilst simultaneously accounting for measurement error, offer more robust alternatives. 
Finally, we identify a need for more research to elucidate the role of physical oceanography, device effects, study animal selection, and developmental stages in predator behaviour and data interpretation.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The aim of the study was to present the within-person variance fraction for adjusting the nutrient intake distribution of adults and the elderly. Data were used from a population survey with a representative sample (n = 511) of individuals aged 19 years or older from the municipality of São Paulo, SP, in 2007. The within-person variance fraction was obtained by the method proposed by Iowa State University. Differences in the within-person variance fractions of nutrients were observed according to sex. These values should be used to adjust the distribution of nutrient intake, since not using them can result in bias in data analysis and interpretation.
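As a rough illustration of why the within-person variance fraction matters, the sketch below shrinks single-day intakes toward the group mean by the factor sqrt(1 - within-person fraction). This is a simplified variance-ratio adjustment, not the Iowa State University method named in the abstract; the function name and the intake values are hypothetical.

```python
import math
from statistics import mean


def shrink_to_usual_intake(observed, within_fraction):
    """Shrink single-day intakes toward the group mean.

    within_fraction is the share of total variance that is within-person
    (an assumed input; the values below are illustrative only).
    """
    if not 0.0 <= within_fraction < 1.0:
        raise ValueError("within_fraction must be in [0, 1)")
    group_mean = mean(observed)
    # Between-person SD divided by observed SD equals sqrt(1 - within fraction).
    factor = math.sqrt(1.0 - within_fraction)
    return [group_mean + (x - group_mean) * factor for x in observed]


# Hypothetical single-day intakes (arbitrary units), not data from the study.
intakes = [40.0, 55.0, 70.0, 85.0, 100.0]
adjusted = shrink_to_usual_intake(intakes, within_fraction=0.75)
print(adjusted)
```

The larger the within-person fraction, the more the unadjusted distribution overstates the spread of usual intake, which is the bias the abstract warns about.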

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The objective was to analyse differences in sociodemographic and health-related characteristics between individuals with and without a residential telephone line. Data from the 2003 Health Survey (ISA-Capital), a cross-sectional study conducted in São Paulo, SP, in the same year, were analysed. Residents who had a residential telephone line were compared with those who reported not having one, according to sociodemographic, lifestyle, health status and health service utilisation variables. The biases associated with non-coverage of the population without a telephone were estimated, and their reduction after post-stratification adjustments was verified. Of the 1,878 respondents over 18 years of age, 80.1% had a residential telephone line. In the comparison between groups, the main sociodemographic differences among individuals without a residential line were: younger age, a higher proportion of individuals of black or brown (parda) race/colour, a lower proportion of married respondents, a higher proportion of unemployed individuals, and lower schooling. Residents without a residential telephone line underwent fewer health examinations, and smoked and drank more. In addition, this group consumed fewer medicines, rated their own health as worse, and made more use of the Sistema Único de Saúde (SUS). When the population without a telephone was excluded from the analysis, the estimates of dental consultations, alcoholism, medicine consumption and use of the SUS for Papanicolaou tests showed the largest bias. After the post-stratification adjustment, the bias of the estimates decreased for the variables associated with possession of a residential telephone line. The exclusion of residents without a telephone line is one of the main limitations of surveys conducted by this means. However, the use of statistical post-stratification adjustment techniques allows non-coverage biases to be reduced.
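The post-stratification idea can be sketched as follows: stratum means from the telephone sample are re-weighted by known population shares. This is a minimal illustration under assumed inputs; the strata, the shares and the 0/1 outcome values are hypothetical and are not taken from ISA-Capital.

```python
def poststratified_estimate(sample, population_shares):
    """Weight stratum means by population shares.

    sample: list of (stratum, value) pairs from the survey.
    population_shares: dict mapping stratum -> share of the target population.
    """
    totals, counts = {}, {}
    for stratum, value in sample:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    return sum(share * totals[s] / counts[s]
               for s, share in population_shares.items())


# Hypothetical telephone sample: 1 = smoker, 0 = non-smoker.
# Low-schooling respondents are under-represented relative to the population.
sample = [("low_schooling", 1), ("low_schooling", 0)] + \
         [("high_schooling", 1)] + [("high_schooling", 0)] * 7
raw = sum(v for _, v in sample) / len(sample)
adjusted = poststratified_estimate(
    sample, {"low_schooling": 0.5, "high_schooling": 0.5})
print(raw, adjusted)
```

In this toy case the unadjusted prevalence understates the population value because the stratum with the higher prevalence is under-sampled; re-weighting moves the estimate toward it. The adjustment only helps for variables that are associated with the stratifying characteristic.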

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Background: Head and neck squamous cell carcinoma (HNSCC) is one of the most common malignancies in humans. The average 5-year survival rate is one of the lowest among aggressive cancers, showing no significant improvement in recent years. When detected early, HNSCC has a good prognosis, but most patients present metastatic disease at the time of diagnosis, which significantly reduces survival rate. Despite extensive research, no molecular markers are currently available for diagnostic or prognostic purposes. Methods: Aiming to identify differentially-expressed genes involved in laryngeal squamous cell carcinoma (LSCC) development and progression, we generated individual Serial Analysis of Gene Expression (SAGE) libraries from a metastatic and non-metastatic larynx carcinoma, as well as from a normal larynx mucosa sample. Approximately 54,000 unique tags were sequenced in three libraries. Results: Statistical data analysis identified a subset of 1,216 differentially expressed tags between tumor and normal libraries, and 894 differentially expressed tags between metastatic and non-metastatic carcinomas. Three genes displaying differential regulation, one down-regulated (KRT31) and two up-regulated (BST2, MFAP2), as well as one with a non-significant differential expression pattern (GNA15) in our SAGE data were selected for real-time polymerase chain reaction (PCR) in a set of HNSCC samples. Consistent with our statistical analysis, quantitative PCR confirmed the upregulation of BST2 and MFAP2 and the downregulation of KRT31 when samples of HNSCC were compared to tumor-free surgical margins. As expected, GNA15 presented a non-significant differential expression pattern when tumor samples were compared to normal tissues. Conclusion: To the best of our knowledge, this is the first study reporting SAGE data in head and neck squamous cell tumors. Statistical analysis was effective in identifying differentially expressed genes reportedly involved in cancer development. 
The differential expression of a subset of genes was confirmed in additional larynx carcinoma samples and in carcinomas from a distinct head and neck subsite. This result suggests the existence of potential common biomarkers for prognosis and targeted-therapy development in this heterogeneous type of tumor.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

3rd SMTDA Conference Proceedings, 11-14 June 2014, Lisbon, Portugal.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Recently, there has been a growing interest in the field of metabolomics, reflected in a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, and Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful, yet simple to use environment. The package, already available on CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

This study utilised recent developments in forensic aromatic hydrocarbon fingerprint analysis to characterise and identify specific biogenic, pyrogenic and petrogenic contamination. The fingerprinting and data interpretation techniques discussed include: recognition of the distribution patterns of hydrocarbons (alkylated naphthalene, phenanthrene, dibenzothiophene, fluorene, chrysene and phenol isomers); analysis of “source-specific marker” compounds (individual saturated hydrocarbons, including n-alkanes, n-C5 through n-C40); selected benzene, toluene, ethylbenzene and xylene isomers (BTEX); the recalcitrant isoprenoids pristane and phytane; and the determination of diagnostic ratios of specific petroleum / non-petroleum constituents, together with the application of various statistical and numerical analysis tools. An unknown sample from the Irish Environmental Protection Agency (EPA), submitted for origin characterisation, was analysed by gas chromatography using both flame ionisation and mass spectral detection, in comparison with known reference materials. The percentages of the individual Polycyclic Aromatic Hydrocarbons (PAHs) and biomarker concentrations in the unknown sample were normalised to the sum of the analytes, and the results were compared with the corresponding results for a range of reference materials. In addition to the determination of conventional diagnostic PAH and biomarker ratios, a number of “source-specific marker” isomeric PAHs within the same alkylation levels were determined, and their relative abundance ratios were computed in order to definitively identify and differentiate the various sources. Statistical logarithmic star plots were generated from both sets of data to give a pictorial representation of the comparison between the unknown sample and reference products. 
The study successfully characterised the unknown sample as being contaminated with a “coal tar” and clearly demonstrates the future role of compound ratio analysis (CORAT) in the identification of possible source contaminants.
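The two numerical steps mentioned in this abstract, normalising analyte concentrations to their sum and computing a diagnostic ratio, can be sketched as below. The concentrations are hypothetical placeholders, the helper names are illustrative, and the pristane/phytane ratio is used only as one example of a diagnostic ratio; none of this reproduces the study's actual data or its CORAT procedure.

```python
def normalise_to_sum(concentrations):
    """Express each analyte as a percentage of the summed analytes."""
    total = sum(concentrations.values())
    if total <= 0:
        raise ValueError("summed concentration must be positive")
    return {name: 100.0 * value / total
            for name, value in concentrations.items()}


def diagnostic_ratio(concentrations, numerator, denominator):
    """Ratio of two analytes, e.g. pristane to phytane."""
    return concentrations[numerator] / concentrations[denominator]


# Hypothetical concentrations (arbitrary units), not values from the study.
unknown_sample = {
    "pristane": 12.0,
    "phytane": 8.0,
    "phenanthrene": 50.0,
    "chrysene": 30.0,
}
profile = normalise_to_sum(unknown_sample)
ratio = diagnostic_ratio(unknown_sample, "pristane", "phytane")
print(profile, ratio)
```

Normalised profiles and ratios computed this way for the unknown sample can then be compared with the same quantities for each reference material.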

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Background: Cardiovascular diseases (CVD) are the leading cause of death in Brazil. Objective: To estimate total CVD, cerebrovascular disease (CBVD), and ischemic heart disease (IHD) mortality rates in adults in the counties of the state of Rio de Janeiro (SRJ), from 1979 to 2010. Methods: The counties of the SRJ were analysed according to their denominations established by the geopolitical structure of 1950. Each county created since then by splitting from an original county was grouped with its county of origin. Population data were obtained from the Brazilian Institute of Geography and Statistics (IBGE), and data on deaths were obtained from DataSus/MS. Mean CVD, CBVD, and IHD mortality rates were estimated, compensated for deaths from ill-defined causes, and adjusted for age and sex using the direct method for three periods: 1979–1989, 1990–1999, and 2000–2010. The results were spatially represented in maps. Tables were also constructed showing the mortality rates for each disease and period. Results: There was a significant reduction in mortality rates across the three disease groups over the three defined periods in all the county clusters analysed. Despite an initial variation in mortality rates among the counties, a homogenization of these rates was observed in the final period (2000–2010). The drop in CBVD mortality was greater than that in IHD mortality. Conclusion: Mortality due to CVD has steadily decreased in the SRJ in the last three decades. This reduction cannot be explained by greater access to high-technology procedures or better control of cardiovascular risk factors, as these have not occurred or have occurred in only a low proportion of cases, with the exception of smoking, which has decreased significantly. Therefore, it is necessary to seek explanations for this decrease, which may be related to improvements in the socioeconomic conditions of the population.
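The direct method of adjustment mentioned in the Methods can be illustrated with a minimal sketch: stratum-specific rates are weighted by a standard population. The age strata, counts and standard population below are hypothetical, and the sketch omits the compensation for ill-defined causes described in the abstract.

```python
def directly_standardised_rate(deaths, population, standard_population):
    """Directly standardised mortality rate per 100,000.

    All arguments are dicts mapping stratum (e.g. age group) -> count.
    """
    std_total = sum(standard_population.values())
    rate = 0.0
    for stratum, std_count in standard_population.items():
        stratum_rate = deaths[stratum] / population[stratum]
        # Weight each stratum rate by its share of the standard population.
        rate += stratum_rate * std_count / std_total
    return rate * 100_000


# Hypothetical county data, not figures from the study.
deaths = {"20-59": 50, "60+": 400}
population = {"20-59": 100_000, "60+": 50_000}
standard = {"20-59": 80_000, "60+": 20_000}

crude = 100_000 * sum(deaths.values()) / sum(population.values())
adjusted = directly_standardised_rate(deaths, population, standard)
print(crude, adjusted)
```

Because the hypothetical county is older than the standard population, its crude rate is higher than its standardised rate; applying the same standard to every county and period is what makes the rates comparable.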

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Background: Acute coronary syndrome (ACS) is defined as a “group of clinical symptoms compatible with acute myocardial ischemia”, representing the leading cause of death worldwide, with a high clinical and financial impact. In this sense, the development of economic studies assessing the costs related to the treatment of ACS should be considered. Objective: To evaluate costs and length of hospital stay between groups of patients treated for ACS undergoing angioplasty with or without stent implantation (stent+ / stent-), coronary artery bypass surgery (CABG) and treated only clinically (Clinical) from the perspective of the Brazilian Supplementary Health System (SHS). Methods: A retrospective analysis of medical claims of beneficiaries of health plans was performed considering hospitalization costs and length of hospital stay for management of patients undergoing different types of treatment for ACS, between Jan/2010 and Jun/2012. Results: The average costs per patient were R$ 18,261.77, R$ 30,611.07, R$ 37,454.94 and R$ 40,883.37 in the following groups: Clinical, stent-, stent+ and CABG, respectively. The average costs per day of hospitalization were R$ 1,987.03, R$ 4,024.72, R$ 6,033.40 and R$ 2,663.82, respectively. The average results for length of stay were 9.19 days, 7.61 days, 6.19 days and 15.20 days in these same groups. For the cost analysis, the differences were significant between all groups except between the Clinical and stent- groups and between the stent+ and CABG groups. Conclusion: Hospitalization costs of ACS are high in the Brazilian SHS, being significantly higher when interventional procedures are required.
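As a reading aid for the Results, the sketch below divides each group's reported average cost per patient by its reported average length of stay and compares the quotient with the reported average cost per day. How the study computed cost per day is not stated in the abstract, so the comparison is only a rough consistency check; a ratio of averages need not equal an average of per-patient ratios.

```python
# Reported figures from the abstract:
# (average cost per patient in R$, average stay in days, average cost per day in R$)
reported = {
    "Clinical": (18261.77, 9.19, 1987.03),
    "stent-": (30611.07, 7.61, 4024.72),
    "stent+": (37454.94, 6.19, 6033.40),
    "CABG": (40883.37, 15.20, 2663.82),
}


def relative_gap(cost_per_patient, stay_days, cost_per_day):
    """Relative difference between cost/stay and the reported cost per day."""
    implied = cost_per_patient / stay_days
    return abs(implied - cost_per_day) / cost_per_day


gaps = {group: relative_gap(*values) for group, values in reported.items()}
for group, gap in gaps.items():
    print(f"{group}: {gap:.2%}")
```

The check also makes the pattern in the figures easier to see: CABG has the highest cost per patient but a lower cost per day than either angioplasty group, because its average stay is the longest.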

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Among the largest resources for biological sequence data is the large number of expressed sequence tags (ESTs) available in public and proprietary databases. ESTs provide information on transcripts, but for technical reasons they often contain sequencing errors. Therefore, when analyzing EST sequences computationally, such errors must be taken into account. Earlier attempts to model error-prone coding regions have shown good performance in detecting and predicting these regions while correcting sequencing errors using codon usage frequencies. In the research presented here, we improve the detection of translation start and stop sites by integrating a more complex mRNA model with error correction based on codon usage bias into one hidden Markov model (HMM), thus generalizing this error correction approach to more complex HMMs. We show that our method maintains the performance in detecting coding sequences.