55 results for Automated data analysis
Abstract:
Android is becoming ubiquitous and currently has the largest share of the mobile OS market, with billions of application downloads from the official app market. It has also become the platform most targeted by mobile malware, which is becoming more sophisticated in order to evade state-of-the-art detection approaches. Many Android malware families employ obfuscation techniques to avoid detection, and this may defeat approaches based on static analysis. Dynamic analysis, on the other hand, may be used to overcome this limitation. Hence, in this paper we propose DynaLog, a dynamic-analysis-based framework for characterizing Android applications. The framework provides the capability to analyse the behaviour of applications based on an extensive number of dynamic features. It provides an automated platform for mass analysis and characterization of apps that is useful for quickly identifying and isolating malicious applications. The DynaLog framework leverages existing open source tools to extract and log high-level behaviours, API calls, and critical events that can be used to explore the characteristics of an application, thus providing an extensible dynamic analysis platform for detecting Android malware. DynaLog is evaluated using real malware samples and clean applications, demonstrating its capabilities for effective analysis and detection of malicious applications.
Abstract:
The results of a study aimed at determining the most important experimental parameters for automated, quantitative analysis of solid dosage form pharmaceuticals (seized and model 'ecstasy' tablets) are reported. Data obtained with a macro-Raman spectrometer were complemented by micro-Raman measurements, which gave information on particle size and provided excellent data for developing statistical models of the sampling errors associated with collecting data as a series of grid points on the tablets' surface. Spectra recorded at single points on the surface of seized MDMA-caffeine-lactose tablets with a Raman microscope (λ_ex = 785 nm, 3 μm diameter spot) were typically dominated by one or other of the three components, consistent with Raman mapping data which showed the drug and caffeine microcrystals were ca. 40 μm in diameter. Spectra collected with a microscope from eight points on a 200 μm grid were combined and in the resultant spectra the average value of the Raman band intensity ratio used to quantify the MDMA:caffeine ratio, μ_r, was 1.19 with an unacceptably high standard deviation, σ_r, of 1.20. In contrast, with a conventional macro-Raman system (150 μm spot diameter), combined eight grid point data gave μ_r = 1.47 with σ_r = 0.16. A simple statistical model which could be used to predict σ_r under the various conditions used was developed. The model showed that the decrease in σ_r on moving to a 150 μm spot was too large to be due entirely to the increased spot diameter but was consistent with the increased sampling volume that arose from a combination of the larger spot size and depth of focus in the macroscopic system. With the macro-Raman system, combining 64 grid points (0.5 mm spacing and 1-2 s accumulation per point) to give a single averaged spectrum for a tablet was found to be a practical balance between minimizing sampling errors and keeping overhead times at an acceptable level.
The effectiveness of this sampling strategy was also tested by quantitative analysis of a set of model ecstasy tablets prepared from MDEA-sorbitol (0-30% by mass MDEA). A simple univariate calibration model of averaged 64-point data had R² = 0.998 and an r.m.s. standard error of prediction of 1.1%, whereas data obtained by sampling just four points on the same tablet showed deviations from the calibration of up to 5%.
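The abstract does not give the statistical model itself, so the following is only an illustrative sketch of why averaging more grid points reduces sampling error. It assumes a deliberately simplified binomial picture in which each grid point independently lands on a given component with probability `p`; the function names, the value of `p`, and the number of simulated tablets are all hypothetical and are not taken from the paper.

```python
import math
import random


def predicted_sigma(p, n_points):
    """Binomial standard deviation of the sampled fraction (assumed model)."""
    return math.sqrt(p * (1.0 - p) / n_points)


def simulated_sigma(p, n_points, n_tablets=2000, seed=0):
    """Monte Carlo estimate of the same quantity over hypothetical tablets."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_tablets):
        # Each grid point independently lands on the component with probability p.
        hits = sum(1 for _ in range(n_points) if rng.random() < p)
        fractions.append(hits / n_points)
    mean = sum(fractions) / n_tablets
    variance = sum((f - mean) ** 2 for f in fractions) / (n_tablets - 1)
    return math.sqrt(variance)


# Compare the analytic and simulated spread for 4, 8 and 64 grid points.
for n in (4, 8, 64):
    print(n, round(predicted_sigma(0.5, n), 4), round(simulated_sigma(0.5, n), 4))
```

Under this assumed model the spread falls as one over the square root of the number of points, which is in line with the qualitative finding that 64-point averages are far more reliable than four-point ones. It does not attempt to reproduce the reported values or the effect of sampling volume.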
Abstract:
A new method for automated coronal loop tracking, in both spatial and temporal domains, is presented. Applying this technique to TRACE data, obtained using the 171 Å filter on 1998 July 14, we detect a coronal loop undergoing a 270 s kink-mode oscillation, as previously found by Aschwanden et al. However, we also detect flare-induced, and previously unnoticed, spatial periodicities on a scale of 3500 km, which occur along the coronal loop edge. Furthermore, we establish a reduction in oscillatory power for these spatial periodicities of 45% over a 222 s interval. We relate the reduction in detected oscillatory power to the physical damping of these loop-top oscillations.
Abstract:
Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understanding the pathogenesis of psoriasis. Analysis of more than one dataset to identify genes commonly upregulated reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objective was to identify genes that are consistently upregulated in lesional skin from three published microarray datasets. We carried out a reanalysis of gene expression data extracted from three experiments on samples from psoriatic and nonlesional skin using the same stringency threshold and software, and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets representing 57 genes commonly upregulated in lesional skin from all datasets. These included 26 probe sets representing 20 genes that have no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets and will need more rigorous experimental testing to be validated. Our analysis also identified 12 of 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways, and these were found to be differentially expressed in the lesional skin samples.
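The idea of keeping only probe sets that are upregulated in every dataset can be sketched as a set intersection. This is a minimal illustration, not the authors' pipeline: the dataset names and probe-set IDs below are hypothetical placeholders, and the thresholding step that would produce each per-dataset hit list is assumed to have been done already.

```python
from functools import reduce

# Hypothetical probe-set IDs flagged as upregulated in each dataset.
upregulated = {
    "dataset_1": {"probe_A", "probe_B", "probe_C"},
    "dataset_2": {"probe_A", "probe_C", "probe_D"},
    "dataset_3": {"probe_A", "probe_C", "probe_E"},
}


def common_hits(hits_by_dataset):
    """Return the probe sets present in every dataset's hit list."""
    if not hits_by_dataset:
        return set()
    return reduce(set.intersection, hits_by_dataset.values())


print(sorted(common_hits(upregulated)))
```

Requiring agreement across all datasets is what lowers the false-positive rate: a probe set flagged by chance in one experiment is unlikely to be flagged in all three.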
Abstract:
Objective: Molecular pathology relies on identifying anomalies using PCR or analysis of DNA/RNA. This is important in solid tumours, where molecular stratification of patients defines targeted treatment. These molecular biomarkers rely on examination of tumour, annotation for possible macro-dissection/tumour cell enrichment and the estimation of % tumour. Manually marking up tumour is error-prone. Method: We have developed a method for automated tumour mark-up and % cell calculations using image analysis, called TissueMark®, based on texture analysis for lung, colorectal and breast (cases = 245, 100, 100 respectively). Pathologists marked slides for tumour and reviewed the automated analysis. A subset of slides was manually counted for tumour cells to provide a benchmark for automated image analysis. Results: There was a strong concordance between pathological and automated mark-up (100 % acceptance rate for macro-dissection). We also showed a strong concordance between manually and automatically drawn boundaries (median exclusion/inclusion error of 91.70 %/89 %). EGFR mutation analysis was precisely the same for manual and automated annotation-based macro-dissection. The annotation accuracy rates in breast and colorectal cancer were 83 and 80 % respectively. Finally, region-based estimations of tumour percentage using image analysis showed significant correlation with actual cell counts. Conclusion: Image analysis can be used for macro-dissection to (i) annotate tissue for tumour and (ii) estimate the % tumour cells, and represents an approach to standardising/improving molecular diagnostics.
Abstract:
Retrospective clinical datasets are often characterized by a relatively small sample size and a large amount of missing data. In this case, a common way of handling the missingness consists in discarding from the analysis patients with missing covariates, further reducing the sample size. Alternatively, if the mechanism that generated the missingness allows, incomplete data can be imputed on the basis of the observed data, avoiding the reduction of the sample size and allowing methods that require complete data to be applied later on. Moreover, methodologies for data imputation might depend on the particular purpose and might achieve better results by considering specific characteristics of the domain. The problem of missing data treatment is studied in the context of survival tree analysis for the estimation of a prognostic patient stratification. Survival tree methods usually address this problem by using surrogate splits, that is, splitting rules that use other variables yielding similar results to the original ones. Instead, our methodology consists in modeling the dependencies among the clinical variables with a Bayesian network, which is then used to perform data imputation, thus allowing the survival tree to be applied to the completed dataset. The Bayesian network is directly learned from the incomplete data using a structural expectation–maximization (EM) procedure in which the maximization step is performed with an exact anytime method, so that the only source of approximation is the EM formulation itself. On both simulated and real data, our proposed methodology usually outperformed several existing methods for data imputation, and the imputation so obtained improved the stratification estimated by the survival tree (especially with respect to using surrogate splits).
Abstract:
The predominant fear in capital markets is that of a price spike. Commodity markets differ in that there is a fear of both upward and downward jumps; this results in implied volatility curves displaying distinct shapes when compared to equity markets. The use of a novel functional data analysis (FDA) approach provides a framework to produce and interpret functional objects that characterise the underlying dynamics of oil futures options. We use the FDA framework to examine implied volatility, jump risk, and pricing dynamics within crude oil markets. Examining a WTI crude oil sample for the 2007–2013 period, which includes the global financial crisis and the Arab Spring, strong evidence is found of converse jump dynamics during periods of demand and supply side weakness. This is used as a basis for an FDA-derived Merton (1976) jump diffusion optimised delta hedging strategy, which exhibits superior portfolio management results over traditional methods.
Abstract:
Context. The Public European Southern Observatory Spectroscopic Survey of Transient Objects (PESSTO) began as a public spectroscopic survey in April 2012. PESSTO classifies transients from publicly available sources and wide-field surveys, and selects science targets for detailed spectroscopic and photometric follow-up. PESSTO runs for nine months of the year, January - April and August - December inclusive, and typically has allocations of 10 nights per month.
Aims. We describe the data reduction strategy and data products that are publicly available through the ESO archive as the Spectroscopic Survey data release 1 (SSDR1).
Methods. PESSTO uses the New Technology Telescope with the instruments EFOSC2 and SOFI to provide optical and NIR spectroscopy and imaging. We target supernovae and optical transients brighter than 20.5 mag for classification. Science targets are selected for follow-up based on the PESSTO science goal of extending knowledge of the extremes of the supernova population. We use standard EFOSC2 set-ups providing spectra with resolutions of 13-18 Å between 3345-9995 Å. A subset of the brighter science targets are selected for SOFI spectroscopy with the blue and red grisms (0.935-2.53 μm and resolutions 23-33 Å) and imaging with broadband JHKs filters.
Results. This first data release (SSDR1) contains flux calibrated spectra from the first year (April 2012-2013). A total of 221 confirmed supernovae were classified, and we released calibrated optical spectra and classifications publicly within 24 h of the data being taken (via WISeREP). The data in SSDR1 replace those released spectra. They have more reliable and quantifiable flux calibrations, correction for telluric absorption, and are made available in standard ESO Phase 3 formats. We estimate the absolute accuracy of the flux calibrations for EFOSC2 across the whole survey in SSDR1 to be typically ∼15%, although a number of spectra will have less reliable absolute flux calibration because of weather and slit losses. Acquisition images for each spectrum are available which, in principle, can allow the user to refine the absolute flux calibration. The standard NIR reduction process does not produce high accuracy absolute spectrophotometry but synthetic photometry with accompanying JHKs imaging can improve this. Whenever possible, reduced SOFI images are provided to allow this.
Conclusions. Future data releases will focus on improving the automated flux calibration of the data products. The rapid turnaround between discovery and classification and access to reliable pipeline processed data products has allowed early science papers in the first few months of the survey.
Abstract:
A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depths on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PCs), implying “significant” structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.
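The centered log-ratio transform mentioned in this abstract has a standard definition: each part is divided by the geometric mean of the composition before taking logarithms. The snippet below is a minimal sketch of that definition only; the three-part sample is hypothetical, and the abstract's own 46-element data and imputation step are not reproduced.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    if any(x <= 0 for x in composition):
        # Zeros or censored values must be imputed before the transform.
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


sample = [60.0, 30.0, 10.0]  # hypothetical 3-part composition (e.g. percentages)
print([round(v, 4) for v in clr(sample)])
```

Two properties make the transform useful for closed data: the transformed values sum to zero, and rescaling the whole composition (for example from percent to fractions) leaves the result unchanged.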
Abstract:
Purpose – The purpose of this paper is to present an analysis of media representation of business ethics within 62 international newspapers to explore the longitudinal and contextual evolution of business ethics and associated terminology. Levels of coverage and contextual analysis of the content of the articles are used as surrogate measures of the penetration of business ethics concepts into society. Design/methodology/approach – This paper uses a text mining application based on two samples of data: analysis of 62 national newspapers in 21 countries from 1990 to 2008; analysis of the content of two samples of articles containing the term business ethics (comprising 100 newspaper articles spread over an 18-year period from a sample of US and UK newspapers). Findings – The paper demonstrates increased coverage of sustainability topics within the media over the last 18 years associated with events such as the Rio Summit. Whilst some peaks are associated with business ethics scandals, the overall coverage remains steady. There is little apparent use in the media of concepts such as corporate citizenship. The academic community and company ethical codes appear to adopt a wider definition of business ethics more akin to that associated with sustainability, in comparison with the focus taken by the media, especially in the USA. Coverage demonstrates clear regional bias, and contextual analysis of the articles in the UK and USA also shows interesting parallels and divergences in the media representation of business ethics. Originality/value – The paper offers a promising avenue for exploring how the evolution of sustainability issues, including business ethics, can be tracked within a societal context.
Abstract:
Implementing effective time analysis methods quickly and accurately in the era of digital manufacturing has become a significant challenge for aerospace manufacturers hoping to build and maintain a competitive advantage. This paper proposes a structure-oriented, knowledge-based approach for intelligent time analysis of aircraft assembly processes within a digital manufacturing framework. A knowledge system is developed so that the design knowledge can be intelligently retrieved for implementing assembly time analysis automatically. A time estimation method based on MOST is reviewed and employed. Knowledge capture, transfer and storage within the digital manufacturing environment are extensively discussed. Configured plantypes, GUIs and functional modules are designed and developed for the automated time analysis. An exemplar study using an aircraft panel assembly from a regional jet is also presented. Although the method currently focuses on aircraft assembly, it can also be applied in other industry sectors, such as transportation, automobile and shipbuilding. The main contribution of the work is to present a methodology that facilitates the integration of time analysis with design and manufacturing using a digital manufacturing platform solution.
Abstract:
Background - The study of corneal endothelium, by specular microscopy, in patients with anterior uveitis has largely been restricted to observations on the endothelial cells. In this prospective study 'keratic precipitates' (KP) in different types of uveitis were examined in different stages of the disease process and the endothelial changes occurring in the vicinity of the KP were evaluated in comparison with the endothelium of the uninvolved eye. Methods - 13 patients with active unilateral uveitis were recruited. The mean age was 42.9 years (range 20-76 years). A Tomey-1100 contact wide field specular (×10) microscope was used to capture endothelial images and KP until the resolution of uveitis. Data regarding type of uveitis, number, size, and nature of KP were recorded. Automated morphometric analysis was done for cell size, cell density and coefficient of variation, and statistical comparisons of cell size and cell density were made (Student's t test) between the endothelium in the vicinity of fresh and resolving KP, fresh KP and normal endothelium, and resolving KP and normal endothelium. Results - On specular microscopy, fresh KP were seen as dense, white glistening deposits occupying 5-10 endothelial cells in diameter, and fine KP were widely distributed and were one or two endothelial cells in diameter. The KP in Posner-Schlossman syndrome had a distinct and different morphology. With clinical remission of uveitis, the KP were observed to undergo characteristic morphological changes: old KP demonstrated a large, dark halo surrounding a central white deposit, and occasionally a dark shadow or a 'lacuna' replaced the site of the original KP. Endothelial blebs were noted as dark shadows or defects in the endothelial mosaic in patients with recurrent uveitis. There was a statistically significant difference in the mean cell size and cell density of endothelial cells in the vicinity of fresh KP compared with normal endothelium of the opposite eye.
Conclusion - This study elucidated the different specular microscopic features of KP in anterior uveitis. Distinct morphological features of large and fine KP were noted. These features underwent dramatic changes on resolution of uveitis. The endothelium was abnormal in the vicinity of KP but returned to near-normal values on resolution of uveitis.
Abstract:
Corneal endothelial cells from normal and traumatized human, primate, cat and rabbit eyes were studied by specular microscopy. Morphometric analysis was performed on micrographs of corneal endothelium using a semi-automated image analysis system. The results showed that under normal conditions the corneal endothelium of all four species exhibits major morphological similarities (mean cell areas: human 317 ± 32 μm², primate 246 ± 22 μm², cat 357 ± 25 μm², rabbit 308 ± 35 μm²). The normal corneal endothelium in man was found to be more polymegethous than that of the other species. Trauma to cat, primate and human corneas resulted in a long-term reduction in endothelial cell density and enhanced polymegethism. In contrast, the reparative response of the rabbit ensured the reformation of an essentially normal monolayer following injury. Endothelial giant cells were a normal inclusion in the rabbit corneal endothelium but were only significant in cat, primate and man following trauma. The presence of corneal endothelial giant cells in amitotic corneas may therefore represent a compensatory response in the absence of mitotic potential.
Abstract:
Context: Comet 67P/Churyumov-Gerasimenko is the target of the European Space Agency Rosetta spacecraft rendez-vous mission. Detailed physical characterisation of the comet before arrival is important for mission planning as well as providing a test bed for ground-based observing and data-analysis methods. Aims: To conduct a long-term observational programme to characterize the physical properties of the nucleus of the comet, via ground-based optical photometry, and to combine our new data with all available nucleus data from the literature. Methods: We applied aperture photometry techniques on our imaging data and combined the extracted rotational lightcurves with data from the literature. Optical lightcurve inversion techniques were applied to constrain the spin state of the nucleus and its broad shape. We performed a detailed surface thermal analysis with the shape model and optical photometry by incorporating both into the new Advanced Thermophysical Model (ATPM), along with all available Spitzer 8-24 μm thermal-IR flux measurements from the literature. Results: A convex triangular-facet shape model was determined with axial ratios b/a = 1.239 and c/a = 0.819. These values can vary by as much as 7% in each axis and still result in a statistically significant fit to the observational data. Our best spin state solution has P_sid = 12.76137 ± 0.00006 h, and a rotational pole orientated at ecliptic coordinates λ = 78° (±10°), β = +58° (±10°). The nucleus phase darkening behaviour was measured and best characterized using the IAU HG system. Best fit parameters are: G = 0.11 ± 0.12 and H_R(1,1,0) = 15.31 ± 0.07. Our shape model combined with the ATPM can satisfactorily reconcile all optical and thermal-IR data, with the fit to the Spitzer 24 μm data taken in February 2004 being exceptionally good.
We derive a range of mutually-consistent physical parameters for each thermal-IR data set, including effective radius, geometric albedo, surface thermal inertia and roughness fraction. Conclusions: The overall nucleus dimensions are well constrained and strongly imply a broad nucleus shape more akin to comet 9P/Tempel 1, rather than the highly elongated or "bi-lobed" nuclei seen for comets 103P/Hartley 2 or 8P/Tuttle. The derived low thermal inertia of