364 results for Dataset


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The rise of social media as communication channels has enabled customers to provide feedback or to ask for assistance quickly and easily. In the context of brand crises, the microblogging platform Twitter is highly relevant because of its ability to support information sharing. By investigating communication on Twitter, the authors examine Twitter activity patterns based on a dataset of some 240,000 tweets during two major brand crises affecting the Australian airline Qantas – the volcanic ash cloud caused by the eruption of Chilean volcano Puyehue in June 2011, and the global grounding of Qantas flights by management in the course of an industrial dispute in October/November 2011. Through this case study we find that characteristics of communication change significantly during different stages of the crisis. Further, we demonstrate that different kinds of crisis result in different communication patterns on Twitter.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Traditional text classification technology based on machine learning and data mining techniques has made considerable progress. However, drawing an exact decision boundary between relevant and irrelevant objects in binary classification remains a major problem because of the uncertainty produced by traditional algorithms. The proposed model, CTTC (Centroid Training for Text Classification), aims to build an uncertainty boundary that absorbs as many indeterminate objects as possible, so as to raise the certainty of the relevant and irrelevant groups through a centroid clustering and training process. The clustering starts from the two training subsets, labelled relevant and irrelevant respectively, to create two principal centroid vectors by which all the training samples are further separated into three groups: POS, NEG and BND, with all the indeterminate objects absorbed into the uncertain decision boundary BND. Two pairs of centroid vectors are then trained and optimized through the subsequent iterative multi-learning process, and together they help predict the polarities of incoming objects. For the assessment of the proposed model, F1 and Accuracy have been chosen as the key evaluation measures. We stress the F1 measure because it can display the overall performance improvement of the final classifier better than Accuracy. A large number of experiments have been completed using the proposed model on the Reuters Corpus Volume 1 (RCV1), which is an important standard dataset in the field. The experimental results show that the proposed model significantly improves binary text classification performance in both F1 and Accuracy compared with three other influential baseline models.
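
The three-way split described in this abstract can be illustrated with a minimal sketch. It is not the authors' CTTC implementation: it assumes cosine similarity to two class centroids and a hypothetical `margin` parameter that sets the width of the uncertain band BND, and it omits the iterative training of the centroid pairs. The toy vectors are made up.

```python
import numpy as np

def unit(v):
    """Scale vectors (rows) to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(v, axis=-1, keepdims=True)
    return v / np.where(norms == 0, 1.0, norms)

def three_way_split(X, y, margin=0.2):
    """Assign each sample to POS, NEG or BND using two class centroids.

    X: (n_samples, n_features) term-weight matrix.
    y: boolean array, True for documents labelled relevant.
    margin: hypothetical half-width of the uncertain band (an assumption).
    """
    Xn = unit(np.asarray(X, dtype=float))
    c_pos = unit(Xn[y].mean(axis=0))    # centroid of the relevant subset
    c_neg = unit(Xn[~y].mean(axis=0))   # centroid of the irrelevant subset
    score = Xn @ c_pos - Xn @ c_neg     # > 0 means closer to the relevant centroid
    labels = np.full(len(Xn), "BND", dtype=object)
    labels[score > margin] = "POS"
    labels[score < -margin] = "NEG"
    return labels

# Toy data: the last document sits between the two classes.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]])
y = np.array([True, True, False, False, True])
labels = three_way_split(X, y, margin=0.2)
print(list(labels))
```

With these toy vectors the ambiguous fifth document falls into BND, while the clearly separated documents keep their training labels.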

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Diabetic macular edema (DME) is one of the most common causes of visual loss among diabetes mellitus patients. Early detection and subsequent treatment may improve visual acuity. DME is mainly graded into non-clinically significant macular edema (NCSME) and clinically significant macular edema according to the location of hard exudates in the macula region. DME can be identified by manual examination of fundus images, but this is laborious and resource intensive. Hence, in this work, automated grading of DME is proposed using higher-order spectra (HOS) of Radon transform projections of the fundus images. We use third-order cumulants and bispectrum magnitude as features and compare their performance. These features can capture subtle changes in the fundus image. Spectral regression discriminant analysis (SRDA) reduces the feature dimension, and the minimum redundancy maximum relevance method is used to rank the significant SRDA components. Ranked features are fed to various supervised classifiers, viz. Naive Bayes, AdaBoost and support vector machine, to discriminate No DME, NCSME and clinically significant macular edema classes. The performance of our system is evaluated using the publicly available MESSIDOR dataset (300 images) and also verified with a local dataset (300 images). Our results show that HOS cumulants and bispectrum magnitude obtained an average accuracy of 95.56% and 94.39% for the MESSIDOR dataset and 95.93% and 93.33% for the local dataset, respectively.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: As the increasing adoption of information technology continues to offer better distant medical services, the distribution of, and remote access to, digital medical images over public networks continues to grow significantly. Such use of medical images raises serious concerns for their continuous security protection, which digital watermarking has shown great potential to address. Methods: We present a content-independent embedding scheme for medical image watermarking. We observe that the perceptual content of medical images varies widely with their modalities. Recent medical image watermarking schemes are image-content dependent and thus they may suffer from inconsistent embedding capacity and visual artefacts. To attain the image content-independent embedding property, we generalise the RONI (region of non-interest, to the medical professionals) selection process and use it for embedding by utilising the RONI’s least significant bit-planes. The proposed scheme thus avoids the need for RONI segmentation, which incurs capacity and computational overheads. Results: Our experimental results demonstrate that the proposed embedding scheme performs consistently over a dataset of 370 medical images covering 7 different modalities. Experimental results also verify how the state-of-the-art reversible schemes can have an inconsistent performance for different modalities of medical images. Our scheme has MSSIM (Mean Structural SIMilarity) larger than 0.999 with a deterministically adaptable embedding capacity. Conclusions: Our proposed image-content independent embedding scheme is modality-wise consistent, and maintains a good image quality of RONI while keeping all other pixels in the image untouched.
Thus, with an appropriate watermarking framework (i.e., with the considerations of watermark generation, embedding and detection functions), our proposed scheme can be viable for the multi-modality medical image applications and distant medical services such as teleradiology and eHealth.
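
As a rough illustration of embedding in the least significant bit-plane of a region chosen without segmentation, the sketch below uses a fixed-width image border as a stand-in RONI. The border choice, the function names and the random test image are all assumptions for illustration; the abstract does not specify how the generalised RONI is selected, and the sketch uses only one bit-plane.

```python
import numpy as np

def border_mask(shape, width):
    """Boolean mask that is True on a frame of `width` pixels around the image."""
    mask = np.ones(shape, dtype=bool)
    mask[width:-width, width:-width] = False
    return mask

def embed_lsb(image, bits, mask):
    """Write `bits` into the least significant bit of the masked pixels."""
    idx = np.flatnonzero(mask)
    if len(bits) > len(idx):
        raise ValueError("payload exceeds the capacity of the selected region")
    out = image.copy()
    flat = out.reshape(-1)              # view on the contiguous copy
    target = idx[: len(bits)]
    flat[target] = (flat[target] & 0xFE) | bits
    return out

def extract_lsb(image, n_bits, mask):
    """Read back the first `n_bits` least significant bits of the masked pixels."""
    idx = np.flatnonzero(mask)[:n_bits]
    return image.reshape(-1)[idx] & 1

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)    # stand-in for a scan
mask = border_mask(image.shape, width=1)                     # hypothetical RONI
payload = rng.integers(0, 2, size=int(mask.sum()), dtype=np.uint8)

marked = embed_lsb(image, payload, mask)
recovered = extract_lsb(marked, len(payload), mask)
```

Because only the lowest bit of border pixels changes, every pixel outside the mask is left untouched and no pixel value moves by more than one grey level.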

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper deals with a finite element modelling method for thin layer mortared masonry systems. In this method, the mortar layers including the interfaces are represented using a zero-thickness interface element and the masonry units are modelled using an elasto-plastic, damaging solid element. The interface element is formulated using two regimes: (i) shear-tension and (ii) shear-compression. In the shear-tension regime, the failure of the joint is considered through an elliptical failure criterion, and in shear-compression it is considered through a Mohr-Coulomb type failure criterion. An explicit integration scheme is used in an implicit finite element framework for the formulation of the interface element. The model is calibrated with an experimental dataset from a thin layer mortared masonry prism subjected to uniaxial compression, a triplet subjected to shear loads, and a beam subjected to flexural loads, and is used to predict the response of thin layer mortared masonry wallettes under orthotropic loading. The model is found to simulate the behaviour of a thin layer mortared masonry shear wall tested under pre-compression and in-plane shear quite adequately. The model is shown to reproduce the failure of masonry panels under a uniform biaxial state of stress.
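
For the shear-compression regime, a generic Mohr-Coulomb check on a joint can be sketched as follows. This is the textbook form of the criterion, not the paper's interface formulation; the cohesion, friction angle and stress values are hypothetical, and the elliptical shear-tension criterion is not shown because the abstract does not give its form.

```python
import math

def mohr_coulomb_yield(tau, sigma_n, cohesion, friction_angle_deg):
    """Yield function f = |tau| + sigma_n * tan(phi) - c.

    sigma_n is the normal stress on the joint, negative in compression.
    f < 0 means the stress state is inside the failure surface;
    f >= 0 means the joint has reached shear failure.
    """
    phi = math.radians(friction_angle_deg)
    return abs(tau) + sigma_n * math.tan(phi) - cohesion

# Hypothetical joint parameters in MPa and degrees (not taken from the paper).
c, phi_deg = 0.5, 35.0
f_safe = mohr_coulomb_yield(tau=1.0, sigma_n=-1.0, cohesion=c, friction_angle_deg=phi_deg)
f_fail = mohr_coulomb_yield(tau=1.3, sigma_n=-1.0, cohesion=c, friction_angle_deg=phi_deg)
print(f_safe < 0, f_fail > 0)
```

Under the same pre-compression, the lower shear stress stays inside the failure surface and the higher one exceeds it, which is the behaviour a shear-compression regime is meant to capture.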

Relevance:

10.00% 10.00%

Publisher:

Abstract:

On 19 June 2015, representatives from over 40 Australian research institutions gathered in Canberra to launch their Open Data Collections. The one-day event, hosted by the Australian National Data Service (ANDS), showcased to government and a range of national stakeholders the rich variety of data collections that have been generated through the Major Open Data Collections (MODC) project. Colin Eustace attended the showcase for QUT Library and presented a poster that reflected the work that he and Jodie Vaughan generated through the project. QUT’s Blueprint 4, the University’s five-year institutional strategic plan, outlines the key priorities of developing a commitment to working in partnership with industry, as well as combining disciplinary strengths with interdisciplinary application. The Division of Technology, Information and Learning Support (TILS) has undertaken a number of Australian National Data Service (ANDS) funded projects since 2009 with the aim of developing improved research data management services within the University to support these strategic aims. By leveraging existing tools and systems developed during these projects, the Major Open Data Collections (MODC) project delivered support to multi-disciplinary collaborative research activities through partnership building between QUT researchers and Queensland government agencies, in order to add to and promote the discovery and reuse of a collection of spatially referenced datasets. The MODC project built upon existing Research Data Finder infrastructure (which uses VIVO open source software, developed by Cornell University) to develop a separate collection, Spatial Data Finder (https://researchdatafinder.qut.edu.au/spatial), as the interface to display the spatial data collection. During the course of the project, 62 dataset descriptions were added to Spatial Data Finder, seven to Research Data Finder and two to Software Finder, another separate collection.
The project team met with 116 individual researchers and attended 13 school and faculty meetings to promote the MODC project and raise awareness of the Library’s services and resources for research data management.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate for the additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of the microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in the interview-interview condition, as the WMFD projection extracts more speaker-discriminant information from a limited number of sessions/speaker data, and the linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Objectives: Demonstrate the application of decision trees – classification and regression trees (CARTs), and their cousins, boosted regression trees (BRTs) – to understand structure in missing data. Setting: Data taken from employees at three different industry sites in Australia. Participants: 7915 observations were included. Materials and Methods: The approach was evaluated using an occupational health dataset comprising results of questionnaires, medical tests, and environmental monitoring. Statistical methods included standard statistical tests and the ‘rpart’ and ‘gbm’ packages for CART and BRT analyses, respectively, from the statistical software ‘R’. A simulation study was conducted to explore the capability of decision tree models in describing data with missingness artificially introduced. Results: CART and BRT models were effective in highlighting a missingness structure in the data, related to the type of data (medical or environmental), the site in which it was collected, the number of visits and the presence of extreme values. The simulation study revealed that CART models were able to identify variables and values responsible for inducing missingness. There was greater variation in variable importance for unstructured compared to structured missingness. Discussion: Both CART and BRT models were effective in describing structural missingness in data. CART models may be preferred over BRT models for exploratory analysis of missing data, and for selecting variables important for predicting missingness. BRT models can show how values of other variables influence missingness, which may prove useful for researchers. Conclusion: Researchers are encouraged to use CART and BRT models to explore and understand missing data.
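
The core idea, treating "is this value missing?" as the outcome of a tree, can be sketched with a single CART-style split. The study itself used the R packages named in the abstract; the pure-NumPy search below is only a depth-one stand-in, and the synthetic `site` and `visits` columns and the rule that one site never records the measurement are invented for illustration.

```python
import numpy as np

def gini(flags):
    """Gini impurity of a boolean array (0.0 for an empty or pure node)."""
    if len(flags) == 0:
        return 0.0
    p = flags.mean()
    return 2.0 * p * (1.0 - p)

def best_split(X, missing):
    """Find the single threshold split that best separates missing from observed rows.

    Returns (feature index, threshold, weighted Gini impurity after the split).
    """
    n, d = X.shape
    best = (None, None, gini(missing))
    for j in range(d):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2.0:   # midpoints between observed values
            left = missing[X[:, j] <= t]
            right = missing[X[:, j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

# Synthetic example: three sites, and site 2 never records the measurement.
rng = np.random.default_rng(1)
site = np.tile([0, 1, 2], 20)
visits = rng.integers(1, 10, size=site.size)
X = np.column_stack([site, visits]).astype(float)
missing = site == 2                                   # missingness indicator

feature, threshold, impurity = best_split(X, missing)
print(feature, threshold, impurity)
```

The split recovers the site variable as the driver of missingness, which is the kind of structure a full tree would report through its first splits and variable importances.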

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Public buildings and large infrastructure are typically monitored by tens or hundreds of cameras, all capturing different physical spaces and observing different types of interactions and behaviours. However, to date, in large part due to limited data availability, crowd monitoring and operational surveillance research has focused on single-camera scenarios, which are not representative of real-world applications. In this paper we present a new, publicly available database for large-scale crowd surveillance. Footage from 12 cameras for a full work day, covering the main floor of a busy university campus building (including an internal and external foyer, elevator foyers, and the main external approach), is provided, alongside annotation for crowd counting (single or multi-camera) and pedestrian flow analysis for 10 and 6 sites respectively. We describe how this large dataset can be used to perform distributed monitoring of building utilisation, and demonstrate the potential of this dataset to understand and learn the relationship between different areas of a building.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In structural brain MRI, group differences or changes in brain structures can be detected using Tensor-Based Morphometry (TBM). This method consists of two steps: (1) a non-linear registration step that aligns all of the images to a common template, and (2) a subsequent statistical analysis. The numerous registration methods that have recently been developed differ in their detection sensitivity when used for TBM, and detection power is paramount in epidemiological studies or drug trials. We therefore developed a new fluid registration method that computes the mappings and performs statistics on them in a consistent way, providing a bridge between TBM registration and statistics. We used the Log-Euclidean framework to define a new regularizer that is a fluid extension of the Riemannian elasticity, which assures diffeomorphic transformations. This regularizer constrains the symmetrized Jacobian matrix, also called the deformation tensor. We applied our method to an MRI dataset from 40 fraternal and identical twins, to reveal voxelwise measures of average volumetric differences in brain structure for subjects with different degrees of genetic resemblance.
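
The quantity that TBM statistics are typically computed on, the local volume change given by the Jacobian determinant of the registration mapping, can be sketched in 2D as follows. This is a generic finite-difference illustration on a synthetic displacement field, not the fluid registration method or the regularizer proposed in the abstract; the grid size and the 10% scaling are assumptions.

```python
import numpy as np

def jacobian_determinant_2d(ux, uy, spacing=1.0):
    """Determinant of the Jacobian of the mapping x -> x + u(x) on a regular grid.

    ux, uy: displacement components, arrays indexed [row (y), column (x)].
    """
    dux_dy, dux_dx = np.gradient(ux, spacing)   # derivatives along rows, then columns
    duy_dy, duy_dx = np.gradient(uy, spacing)
    return (1.0 + dux_dx) * (1.0 + duy_dy) - dux_dy * duy_dx

# Synthetic deformation: uniform 10% expansion in both directions.
yy, xx = np.mgrid[0:16, 0:16].astype(float)
ux, uy = 0.1 * xx, 0.1 * yy

det = jacobian_determinant_2d(ux, uy)
log_det = np.log(det)    # > 0 marks local expansion, < 0 local contraction
print(det[0, 0], log_det[0, 0])
```

A determinant above 1 (positive log-determinant) indicates that the tissue at that location is larger than in the template, and maps of this value across subjects are what the statistical step compares.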

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, we develop and validate a new Statistically Assisted Fluid Registration Algorithm (SAFIRA) for brain images. A non-statistical version of this algorithm was first implemented in [2] and re-formulated using Lagrangian mechanics in [3]. Here we extend this algorithm to 3D: given 3D brain images from a population, vector fields and their corresponding deformation matrices are computed in a first round of registrations using the non-statistical implementation. Covariance matrices for both the deformation matrices and the vector fields are then obtained and incorporated (separately or jointly) in the regularizing (i.e., the non-conservative Lagrangian) terms, creating four versions of the algorithm. We evaluated the accuracy of each algorithm variant using the manually labeled LPBA40 dataset, which provides us with ground truth anatomical segmentations. We also compared the power of the different algorithms using tensor-based morphometry (a technique to analyze local volumetric differences in brain structure) applied to 46 3D brain scans from healthy monozygotic twins.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Modern non-invasive brain imaging technologies, such as diffusion weighted magnetic resonance imaging (DWI), enable the mapping of neural fiber tracts in the white matter, providing a basis to reconstruct a detailed map of brain structural connectivity networks. Brain connectivity networks differ from random networks in their topology, which can be measured using small worldness, modularity, and high-degree nodes (hubs). Still, little is known about how individual differences in structural brain network properties relate to age, sex, or genetic differences. Recently, some groups have reported brain network biomarkers that enable differentiation among individuals, pairs of individuals, and groups of individuals. In addition to studying new topological features, here we provide a unifying general method to investigate topological brain networks and connectivity differences between individuals, pairs of individuals, and groups of individuals at several levels of the data hierarchy, while appropriately controlling false discovery rate (FDR) errors. We apply our new method to a large dataset of high quality brain connectivity networks obtained from High Angular Resolution Diffusion Imaging (HARDI) tractography in 303 young adult twins, siblings, and unrelated people. Our proposed approach can accurately classify brain connectivity networks based on sex (93% accuracy) and kinship (88.5% accuracy). We find statistically significant differences associated with sex and kinship both in the brain connectivity networks and in derived topological metrics, such as the clustering coefficient and the communicability matrix.
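
To make the idea of controlling the false discovery rate concrete, the sketch below applies the standard Benjamini-Hochberg step-up procedure to a list of p-values. This is an assumption for illustration: the abstract does not say which FDR procedure the authors use, and their method operates at several levels of the data hierarchy, which this single-level sketch does not reproduce. The p-values are made up.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean array marking the hypotheses rejected at FDR level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # indices of p-values, smallest first
    thresholds = q * np.arange(1, m + 1) / m       # k/m * q for rank k
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])          # largest rank meeting its threshold
        reject[order[: k + 1]] = True              # reject everything up to that rank
    return reject

# Made-up p-values, e.g. one per network metric being tested.
p_values = [0.039, 0.001, 0.212, 0.008, 0.06]
reject = benjamini_hochberg(p_values, q=0.05)
print(reject)
```

Here only the two smallest p-values survive: 0.039 would pass an uncorrected 0.05 cut-off but fails its rank-specific threshold of 0.03.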

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The SNP-SNP interactome has rarely been explored in the context of neuroimaging genetics, mainly due to the complexity of conducting approximately 10^11 pairwise statistical tests. However, recent advances in machine learning, specifically the iterative sure independence screening (SIS) method, have enabled the analysis of datasets where the number of predictors is much larger than the number of observations. Using an implementation of the SIS algorithm (called EPISIS), we performed an exhaustive search of the genome-wide SNP-SNP interactome to identify and prioritize SNPs for interaction analysis. We identified a significant SNP pair, rs1345203 and rs1213205, associated with temporal lobe volume. We further examined the full-brain, voxelwise effects of the interaction in the ADNI dataset and separately in an independent dataset of healthy twins (QTIM). We found that each additional loading in the epistatic effect was associated with approximately 5% greater regional brain volume (a protective effect) in both the ADNI and QTIM samples.
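
The order of magnitude of the pairwise search is easy to check. The SNP count below is a hypothetical round figure, since the abstract does not state how many SNPs were genotyped; it simply shows that a few hundred thousand SNPs already give on the order of 10^11 unordered pairs.

```python
from math import comb

n_snps = 500_000                  # hypothetical genome-wide SNP count (assumption)
n_pairs = comb(n_snps, 2)         # number of unordered SNP-SNP pairs: n * (n - 1) / 2

print(f"{n_pairs:,} pairwise tests (about {n_pairs:.2e})")
```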

Relevance:

10.00% 10.00%

Publisher:

Abstract:

To understand factors that affect brain connectivity and integrity, it is beneficial to automatically cluster white matter (WM) fibers into anatomically recognizable tracts. Whole brain tractography, based on diffusion-weighted MRI, generates vast sets of fibers throughout the brain; clustering them into consistent and recognizable bundles can be difficult as there are wide individual variations in the trajectory and shape of WM pathways. Here we introduce a novel automated tract clustering algorithm based on label fusion - a concept from traditional intensity-based segmentation. Streamline tractography generates many incorrect fibers, so our top-down approach extracts tracts consistent with known anatomy, by mapping multiple hand-labeled atlases into a new dataset. We fuse clustering results from different atlases, using a mean distance fusion scheme. We reliably extracted the major tracts from 105-gradient high angular resolution diffusion images (HARDI) of 198 young normal twins. To compute population statistics, we use a pointwise correspondence method to match, compare, and average WM tracts across subjects. We illustrate our method in a genetic study of white matter tract heritability in twins.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We present a new algorithm to compute the voxel-wise genetic contribution to brain fiber microstructure using diffusion tensor imaging (DTI) in a dataset of 25 monozygotic (MZ) and 25 dizygotic (DZ) twin pairs (100 subjects total). First, the structural and DT scans were linearly co-registered. Structural MR scans were nonlinearly mapped via a 3D fluid transformation to a geometrically centered mean template, and the deformation fields were applied to the DTI volumes. After tensor re-orientation to realign them to the anatomy, we computed several scalar and multivariate DT-derived measures including the geodesic anisotropy (GA), the tensor eigenvalues and the full diffusion tensors. A covariance-weighted distance was measured between twins in the Log-Euclidean framework [2], and used as input to a maximum-likelihood based algorithm to compute the contributions from genetics (A), common environmental factors (C) and unique environmental ones (E) to fiber architecture. Quantitative genetic studies can take advantage of the full information in the diffusion tensor, using covariance-weighted distances and statistics on the tensor manifold.
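
A minimal sketch of a distance between two diffusion tensors in the Log-Euclidean framework is given below. It computes the plain (unweighted) Log-Euclidean distance, the Frobenius norm of the difference of matrix logarithms; the covariance weighting and the maximum-likelihood A/C/E estimation described in the abstract are not reproduced, and the tensor values are illustrative only.

```python
import numpy as np

def spd_log(D):
    """Matrix logarithm of a symmetric positive-definite tensor via eigendecomposition."""
    w, V = np.linalg.eigh(D)
    return (V * np.log(w)) @ V.T      # V diag(log w) V^T

def log_euclidean_distance(D1, D2):
    """Frobenius norm of log(D1) - log(D2)."""
    return np.linalg.norm(spd_log(D1) - spd_log(D2), ord="fro")

# Illustrative tensors in mm^2/s: an anisotropic tensor and the same tensor doubled.
D_a = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
D_b = 2.0 * D_a

dist = log_euclidean_distance(D_a, D_b)
print(dist)
```

Scaling a tensor by a factor s shifts its logarithm by log(s) times the identity, so the distance here is log(2) times the square root of 3 regardless of the tensor's shape, which shows that the metric measures relative rather than absolute differences in diffusivity.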