960 results for Weighted Query
Abstract:
Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges, and each edge is labeled with a semantic annotation. Hence, a single huge graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web, and the graph queries of other graph DBMSs, can also be viewed as subgraph matching over large graphs. Although subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The proposed models cover a practically important subset of the SPARQL query language, augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from the matched vertices' properties in each answer, in accordance with a user-specified notion of importance.
The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them under user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped, and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of the answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a great deal of freedom in specifying: (i) what patterns and approximations they consider important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used to answer SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY, and LIMIT. We test our algorithms on multiple real-world graph databases, showing that they are far more efficient than popular triple stores.
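The top-k substitution idea behind a model like SIQ can be sketched in a few lines of Python: enumerate substitutions for a triple pattern with a naive join, score each one by a user-specified importance function over the matched vertices, and keep the k best. Everything below (the toy triple set, the importance table, the pattern) is illustrative and not taken from the thesis:

```python
import heapq

# Toy edge-labeled graph: a set of (subject, predicate, object) triples,
# mirroring the view of SPARQL-style queries as subgraph matching.
TRIPLES = {
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("carol", "worksAt", "acme"),
    ("bob", "worksAt", "acme"),
}
# Hypothetical user-specified vertex importance values.
IMPORTANCE = {"alice": 3.0, "bob": 2.0, "carol": 5.0, "acme": 1.0}

def match(pattern, triples):
    """Enumerate substitutions for a pattern of triples; terms starting
    with '?' are variables (naive nested-loop join, no pruning)."""
    subs = [{}]
    for (s, p, o) in pattern:
        new_subs = []
        for sub in subs:
            for (ts, tp, to) in triples:
                trial = dict(sub)
                ok = True
                for var, val in ((s, ts), (p, tp), (o, to)):
                    if var.startswith("?"):
                        if trial.setdefault(var, val) != val:
                            ok = False
                    elif var != val:
                        ok = False
                if ok:
                    new_subs.append(trial)
        subs = new_subs
    return subs

def top_k(pattern, k, score):
    """Rank substitutions by a user-specified score (SIQ-style)."""
    return heapq.nlargest(k, match(pattern, TRIPLES), key=score)

# "Find ?x who knows someone working at acme", ranked by importance of ?x.
pattern = [("?x", "knows", "?y"), ("?y", "worksAt", "acme")]
answers = top_k(pattern, 2, lambda s: IMPORTANCE[s["?x"]])
```

A real engine would replace the nested-loop join with indexed lookups and the exhaustive enumeration with pruning, which is precisely where the thesis's contributions lie.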
Abstract:
Homomorphic encryption is a particular type of encryption that enables computing over encrypted data. This has a wide range of real-world ramifications, such as being able to blindly compute a search result sent to a remote server without revealing its content. In the first part of this thesis, we discuss how database search queries can be made secure using a homomorphic encryption scheme based on the ideas of Gahi et al. Gahi's method is based on the integer-based fully homomorphic encryption scheme proposed by van Dijk et al. We propose a new database search scheme called the Homomorphic Query Processing Scheme, which can be used with the ring-based fully homomorphic encryption scheme proposed by Brakerski. In the second part of this thesis, we discuss the cybersecurity of the smart electric grid. Specifically, we use the Homomorphic Query Processing scheme to construct a keyword search technique for the smart grid. Our work is based on the Public Key Encryption with Keyword Search (PEKS) method introduced by Boneh et al. and a Multi-Key Homomorphic Encryption scheme proposed by López-Alt et al. A summary of the results of this thesis (specifically the Homomorphic Query Processing Scheme) was published at the 14th Canadian Workshop on Information Theory (CWIT).
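As a hedged illustration of the core idea of computing on ciphertexts, the snippet below uses textbook RSA, which is multiplicatively homomorphic: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts. The fully homomorphic schemes discussed in the thesis are far more involved; parameters here are toy-sized and insecure:

```python
# Textbook RSA keypair with toy primes (insecure, for illustration only).
p, q = 61, 53
n = p * q                      # modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, gcd(e, phi) = 1
d = pow(e, -1, phi)            # private exponent (Python 3.8+)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

m1, m2 = 7, 12
# The server multiplies ciphertexts only; it never sees 7 or 12.
c_prod = (enc(m1) * enc(m2)) % n
# Decrypting recovers the product of the plaintexts:
# Enc(m1) * Enc(m2) = m1^e * m2^e = (m1*m2)^e (mod n).
result = dec(c_prod)
```

Fully homomorphic schemes extend this so that both additions and multiplications (and hence arbitrary circuits, such as the thesis's query processing) can be evaluated blindly.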
Abstract:
Aims: To investigate the use of diffusion-weighted magnetic resonance imaging (DWI) and apparent diffusion coefficient (ADC) values in the diagnosis of hemangioma. Materials and methods: The study population consisted of 72 patients with liver masses larger than 1 cm (72 focal lesions). DWI examination with a b value of 600 s/mm² was carried out for all patients. After the DWI examination, an ADC map was created and ADC values were measured for the 72 liver masses and for normal liver tissue (control group). The average ADC values of normal liver tissue and focal liver lesions, the cut-off ADC values, and the diagnostic sensitivity and specificity of the ADC map in diagnosing hemangioma and benign and malignant lesions were investigated. Results: Of the 72 liver masses, 51 were benign and 21 were malignant. The benign lesions comprised 38 hemangiomas and 13 simple cysts. The malignant lesions comprised 9 hepatocellular carcinomas and 12 metastases. The highest ADC values were measured for cysts (3.782 ± 0.53 × 10⁻³ mm²/s) and hemangiomas (2.705 ± 0.63 × 10⁻³ mm²/s). The average ADC value of hemangiomas was significantly higher than that of malignant lesions and the normal control group (p < 0.001). The average ADC value of cysts was significantly higher than that of hemangiomas and the normal control group (p < 0.001). To distinguish hemangiomas from malignant liver lesions, a cut-off ADC value of 1.800 × 10⁻³ mm²/s had a sensitivity of 97.4% and a specificity of 90.9%. To distinguish hemangioma from normal liver parenchyma, a cut-off value of 1.858 × 10⁻³ mm²/s had a sensitivity of 97.4% and a specificity of 95.7%. To distinguish benign from malignant liver lesions, a cut-off value of 1.800 × 10⁻³ mm²/s had a sensitivity of 96.1% and a specificity of 90.0%. Conclusion: DWI and quantitative measurement of ADC values can be used in the differential diagnosis of benign and malignant liver lesions, and also in the diagnosis and differentiation of hemangiomas.
When dynamic examination cannot distinguish vascular metastases and other lesions from hemangioma, DWI and ADC values can be useful in the primary and differential diagnosis. The technique does not require contrast material, so it can safely be used in patients with renal failure.
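The diagnostic use of a cut-off ADC value amounts to simple threshold classification, from which sensitivity and specificity follow directly. The sketch below applies the abstract's 1.800 × 10⁻³ mm²/s cut-off to a handful of hypothetical cases (the ADC values are illustrative, not the study's data):

```python
# Lesions with ADC above the cut-off are called benign (hemangioma/cyst);
# those below are called malignant, following the abstract's rule.
CUTOFF = 1.800e-3  # mm^2/s, cut-off reported in the abstract

def classify(adc):
    return "benign" if adc > CUTOFF else "malignant"

# (adc value in mm^2/s, true label) pairs -- hypothetical cases.
cases = [(2.7e-3, "benign"), (3.1e-3, "benign"),
         (1.1e-3, "malignant"), (0.9e-3, "malignant"), (2.0e-3, "malignant")]

tp = sum(1 for adc, t in cases if t == "benign" and classify(adc) == "benign")
fn = sum(1 for adc, t in cases if t == "benign" and classify(adc) == "malignant")
tn = sum(1 for adc, t in cases if t == "malignant" and classify(adc) == "malignant")
fp = sum(1 for adc, t in cases if t == "malignant" and classify(adc) == "benign")

sensitivity = tp / (tp + fn)   # fraction of benign lesions detected
specificity = tn / (tn + fp)   # fraction of malignant lesions rejected
```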
Abstract:
PURPOSE: We aimed to evaluate the added value of diffusion-weighted imaging (DWI) over standard magnetic resonance imaging (MRI) for detecting post-treatment cervical cancer recurrence. The detection accuracy of T2-weighted (T2W) images was compared with that of T2W MRI combined with either dynamic contrast-enhanced (DCE) MRI or DWI. METHODS: Thirty-eight women with clinically suspected uterine cervical cancer recurrence more than six months after treatment completion were examined with 1.5 Tesla MRI including T2W, DCE, and DWI sequences. Disease was confirmed histologically and correlated with MRI findings. The diagnostic performance of T2W imaging and its combination with either DCE or DWI was analyzed. Sensitivity, positive predictive value, and accuracy were calculated. RESULTS: Thirty-six women had histologically proven recurrence. The accuracy of recurrence detection was 80% with T2W/DCE MRI and 92.1% with T2W/DWI. The addition of DCE sequences did not significantly improve the diagnostic ability of T2W imaging, and this sequence combination yielded two false positives and seven false negatives. The T2W/DWI combination achieved a positive predictive value of 100%, with only three false negatives. CONCLUSION: The addition of DWI to T2W sequences considerably improved the diagnostic ability of MRI. Our results support the inclusion of DWI in the initial MRI protocol for the detection of cervical cancer recurrence, leaving DCE sequences as an option for uncertain cases.
Abstract:
In recent years, technological improvements have enabled internet users to retrieve and analyze data about Internet searches, and this data has been used in several fields of study. Some authors have used search-engine query data to forecast economic variables, to detect areas of influenza activity, or to demonstrate that it is possible to capture certain patterns in stock market indexes. In this paper, an investment strategy is presented that uses Google Trends' weekly query data for the constituents of major global stock market indexes. The results suggest that it is indeed possible to achieve higher Info Sharpe ratios, especially for the major European stock market indexes, in comparison to those provided by a buy-and-hold strategy over the period considered.
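The performance comparison described above boils down to computing a (here annualised) Sharpe ratio on weekly return series. A minimal sketch follows; the return series and the zero risk-free rate are hypothetical, not the paper's data:

```python
import statistics

def sharpe_ratio(weekly_returns, risk_free_weekly=0.0, periods=52):
    """Annualised Sharpe ratio of a weekly return series:
    mean excess return over its standard deviation, scaled by sqrt(52)."""
    excess = [r - risk_free_weekly for r in weekly_returns]
    mean = statistics.mean(excess)
    sd = statistics.stdev(excess)
    return (mean / sd) * periods ** 0.5

# Hypothetical weekly returns for a query-driven strategy vs. buy-and-hold.
strategy = [0.010, -0.002, 0.008, 0.004, -0.001, 0.006]
buy_hold = [0.004, -0.006, 0.005, 0.001, -0.004, 0.003]
```

Comparing `sharpe_ratio(strategy)` against `sharpe_ratio(buy_hold)` over the same period is the kind of test the paper runs on Google Trends data.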
Abstract:
This report discusses the calculation of analytic second-order bias-correction techniques for the maximum likelihood estimates (MLEs, for short) of the unknown parameters of distributions used in quality and reliability analysis. It is well known that MLEs are widely used to estimate the unknown parameters of probability distributions due to their various desirable properties; for example, MLEs are asymptotically unbiased, consistent, and asymptotically normal. However, many of these properties depend on extremely large sample sizes. Some of them, such as unbiasedness, may not be valid for small or even moderate sample sizes, which are more common in real data applications. Therefore, bias-corrected techniques for the MLEs are desired in practice, especially when the sample size is small. Two commonly used techniques to reduce the bias of MLEs are the 'preventive' and 'corrective' approaches. Both can reduce the bias of the MLEs to order O(n⁻²), but the 'preventive' approach does not have an explicit closed-form expression. Consequently, we mainly focus on the 'corrective' approach in this report. To illustrate the importance of bias correction in practice, we apply the bias-corrected method to two popular lifetime distributions: the inverse Lindley distribution and the weighted Lindley distribution. Numerical studies based on the two distributions show that the considered bias-corrected technique is highly recommended over other commonly used estimators without bias correction. Therefore, special attention should be paid when we estimate the unknown parameters of probability distributions in scenarios where the sample size is small or moderate.
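A hedged illustration of 'corrective' bias reduction, using the exponential distribution rather than the Lindley distributions of the report (whose corrections are more involved): the MLE of the rate is lam_hat = n / sum(x), with E[lam_hat] = n·lam/(n-1), so multiplying by (n-1)/n removes the leading bias term. A small simulation makes the effect visible:

```python
import random

random.seed(0)
lam, n, reps = 2.0, 10, 20000   # true rate, sample size, replications
mle, corrected = [], []
for _ in range(reps):
    x = [random.expovariate(lam) for _ in range(n)]
    lam_hat = n / sum(x)                   # MLE of the rate
    mle.append(lam_hat)
    corrected.append((n - 1) / n * lam_hat)  # corrective adjustment

# Absolute bias of each estimator, estimated by Monte Carlo.
bias_mle = abs(sum(mle) / reps - lam)        # approx. lam/(n-1) = 0.22
bias_corr = abs(sum(corrected) / reps - lam)  # close to zero
```

For the exponential rate the corrected estimator happens to be exactly unbiased; for the inverse and weighted Lindley distributions the report derives the analogous analytic corrections.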
Abstract:
Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat; it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publicly available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex-systems model that incorporates multiple quantitative, qualitative, and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically weighted regression (GWR) analysis produced the most accurate results, accommodating non-stationary coefficient behavior and demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism.
This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.
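The non-stationary coefficients mentioned above are what geographically weighted regression captures: at each target location, an ordinary least-squares fit is performed with observations down-weighted by their distance from the target, so the fitted coefficients vary over space. The single-predictor sketch below uses a Gaussian kernel; the data, bandwidth, and variables are hypothetical, not the dissertation's:

```python
import math

def gwr_fit(points, target, bandwidth):
    """points: list of (x_coord, y_coord, predictor, response).
    Returns (intercept, slope) at `target` via weighted least squares
    with a Gaussian spatial kernel."""
    sw = swx = swy = swxx = swxy = 0.0
    for (px, py, x, y) in points:
        d2 = (px - target[0]) ** 2 + (py - target[1]) ** 2
        w = math.exp(-d2 / bandwidth ** 2)   # weight decays with distance
        sw += w; swx += w * x; swy += w * y
        swxx += w * x * x; swxy += w * x * y
    det = sw * swxx - swx * swx              # 2x2 normal equations
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope

# Two spatial clusters with different local relationships (y = 2x vs y = 5x),
# i.e. non-stationary coefficient behaviour.
pts = [(0, 0, 1.0, 2.0), (1, 0, 2.0, 4.0), (0, 1, 3.0, 6.0),
       (10, 10, 1.0, 5.0), (11, 10, 2.0, 10.0), (10, 11, 3.0, 15.0)]
b0_a, b1_a = gwr_fit(pts, (0, 0), bandwidth=2.0)
b0_b, b1_b = gwr_fit(pts, (10, 10), bandwidth=2.0)
```

A global (unweighted) regression would average these two regimes away, which is why GWR outperformed it in the dissertation's tests.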
Abstract:
Conventional web search engines are centralised in that a single entity crawls and indexes the documents selected for future retrieval, and controls the relevance models used to determine which documents are relevant to a given user query. As a result, these search engines suffer from several technical drawbacks, such as handling scale, timeliness, and reliability, in addition to ethical concerns such as commercial manipulation and information censorship. To alleviate the need to rely entirely on a single entity, Peer-to-Peer (P2P) Information Retrieval (IR) has been proposed as a solution, as it distributes the functional components of a web search engine, from crawling and indexing documents to query processing, across the network of users (or peers) who use the search engine. This strategy for constructing an IR system poses several efficiency and effectiveness challenges, which have been identified in past work. Accordingly, this thesis makes several contributions towards advancing the state of the art in P2P-IR effectiveness by improving the query processing and relevance scoring aspects of P2P web search. Federated search systems are a form of distributed information retrieval in which the user's information need, formulated as a query, is routed to distributed resources and the retrieved result lists are merged into a final list. P2P-IR networks are one form of federated search, routing queries and merging results among participating peers. The query is propagated through disseminated nodes to reach the peers that are most likely to contain relevant documents, and the retrieved result lists are then merged at different points along the path from the relevant peers back to the query initiator (the customer).
However, query routing is considered one of the major challenges and a critical part of P2P-IR networks: relevant peers might be lost through low-quality peer selection during query routing, which inevitably leads to less effective retrieval results. This motivates this thesis to study and propose query routing techniques that improve retrieval quality in such networks. Cluster-based semi-structured P2P-IR networks exploit the cluster hypothesis to organise the peers into semantically similar clusters, each managed by super-peers. In this thesis, I construct three semi-structured P2P-IR models and examine their retrieval effectiveness. I also leverage the cluster centroids at the super-peer level, as content representations gathered from cooperative peers, to propose a query routing approach called the Inverted PeerCluster Index (IPI), which simulates the conventional inverted index of a centralised corpus to organise the statistics of peers' terms. The results show competitive retrieval quality in comparison to baseline approaches. Furthermore, I study the applicability of conventional Information Retrieval models as peer selection approaches, where each peer can be considered a big document made up of its documents. The experimental evaluation shows competitive and significant results, demonstrating that document retrieval methods are very effective for peer selection and reinforcing the analogy between documents and peers. Additionally, Learning to Rank (LtR) algorithms are exploited to build a learned classifier for peer ranking at the super-peer level. The experiments show significant results against state-of-the-art resource selection methods and competitive results against corresponding classification-based approaches. Finally, I propose reputation-based query routing approaches that exploit the idea of providing feedback on a specific item in social community networks and managing it for future decision-making.
The system monitors users' behaviour when they click or download documents from the final ranked list, treats this as implicit feedback, and mines the given information to build a reputation-based data structure. The data structure is used to score peers and then rank them for query routing. I conduct a set of experiments covering various scenarios, including noisy feedback information (i.e., positive feedback on non-relevant documents), to examine the robustness of the reputation-based approaches. The empirical evaluation shows significant results on almost all measurement metrics, with an approximate improvement of more than 56% over baseline approaches. Thus, based on these results, if one were to choose a single technique, reputation-based approaches are clearly the natural choice, and they can also be deployed on any P2P network.
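The "peer as a big document" idea used for peer selection can be sketched briefly: concatenate each peer's documents into one pseudo-document and rank peers for a query with a standard document-retrieval score, here a smoothed query-likelihood model. The peers, documents, and smoothing constant below are illustrative, not the thesis's collections:

```python
import math
from collections import Counter

# Each peer's collection, to be treated as one big document.
peers = {
    "peer_a": ["p2p networks route queries", "query routing in p2p"],
    "peer_b": ["liver lesions on mri", "diffusion weighted imaging"],
}

def peer_score(query, docs, mu=1.0):
    """Log query-likelihood of the query under the peer's big document,
    with additive (Laplace-style) smoothing to avoid zero probabilities."""
    big_doc = Counter(" ".join(docs).split())
    total = sum(big_doc.values())
    vocab = len(big_doc) + 1
    return sum(math.log((big_doc[t] + mu) / (total + mu * vocab))
               for t in query.split())

def route(query, k=1):
    """Select the top-k peers to forward the query to."""
    return sorted(peers, key=lambda p: peer_score(query, peers[p]),
                  reverse=True)[:k]
```

Routing only to the top-scoring peers is what keeps query traffic bounded; the thesis's IPI, LtR, and reputation-based approaches replace this scoring step with progressively richer evidence.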
Abstract:
The introduction of molecular criteria into the classification of diffuse gliomas has added interesting practical implications to glioma management. This has created a new clinical need to correlate imaging characteristics with glioma genotypes, a field also known as radiogenomics or imaging genomics. Whilst many studies have primarily focused on the use of advanced magnetic resonance imaging (MRI) techniques for radiogenomics purposes, conventional MRI sequences still remain the reference point in the study and characterization of brain tumours. Moreover, a different approach may rely on diffusion-weighted imaging (DWI), which is considered a “conventional” sequence in line with recently published directions on glioma imaging. In a non-invasive way, it can provide direct insight into the microscopic physical properties of tissues. Considering that Isocitrate Dehydrogenase (IDH) gene mutations may reflect alterations in metabolism, cellularity, and angiogenesis, which may manifest as characteristic features on MRI, the identification of specific MRI biomarkers could be of great interest in managing patients with brain gliomas. My study aimed to evaluate the presence of specific MRI-derived biomarkers of IDH molecular status through conventional MRI and DWI sequences.
Abstract:
During the last semester of the Master's Degree in Artificial Intelligence, I carried out my internship working for TXT e-Solution on the ADMITTED project. This paper describes the work done in those months. The thesis is divided into two parts, representing the two different tasks I was assigned during the course of my experience. The first part introduces the project and describes the work done on the admittedly library: maintaining the code base and writing the test suites. This work is closer to a software engineering role, involving developing features, fixing bugs, and testing. The second part describes the experiments done on an anomaly detection task using a Deep Learning technique called an autoencoder; this task is, on the other hand, closer to a data science role. The two tasks were not done simultaneously but were dealt with one after the other, which is why I preferred to divide them into two separate parts of this paper.
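The anomaly-detection idea above, in its simplest form: an autoencoder compresses and reconstructs normal data, and points with large reconstruction error are flagged as anomalies. As a hedged sketch we use the linear special case (a one-unit linear autoencoder is equivalent to projecting onto the first principal component, found here by power iteration); the project's actual deep autoencoder, data, and threshold are not reproduced, and everything below is illustrative:

```python
import random

random.seed(7)
# "Normal" training data lies near the line y = x.
data = [(t, t + random.uniform(-0.05, 0.05))
        for t in [random.uniform(-1, 1) for _ in range(300)]]

# 2x2 covariance of the data.
mx = sum(p[0] for p in data) / len(data)
my = sum(p[1] for p in data) / len(data)
cxx = sum((p[0] - mx) ** 2 for p in data) / len(data)
cyy = sum((p[1] - my) ** 2 for p in data) / len(data)
cxy = sum((p[0] - mx) * (p[1] - my) for p in data) / len(data)

# Power iteration for the leading principal direction (the "bottleneck").
w = (1.0, 0.0)
for _ in range(50):
    v = (cxx * w[0] + cxy * w[1], cxy * w[0] + cyy * w[1])
    norm = (v[0] ** 2 + v[1] ** 2) ** 0.5
    w = (v[0] / norm, v[1] / norm)

def reconstruction_error(p):
    z = w[0] * (p[0] - mx) + w[1] * (p[1] - my)   # encode to 1-d code
    r = (mx + w[0] * z, my + w[1] * z)            # decode back to 2-d
    return (r[0] - p[0]) ** 2 + (r[1] - p[1]) ** 2

THRESHOLD = 0.01                                  # illustrative cut-off
normal_e = reconstruction_error((0.4, 0.4))       # on the normal manifold
anomaly_e = reconstruction_error((0.4, -0.4))     # off it: flagged
```

A deep autoencoder generalises this by learning a nonlinear manifold, but the detection rule is the same: score by reconstruction error, flag above a threshold.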
Abstract:
Our objective was to investigate spinal cord (SC) atrophy in amyotrophic lateral sclerosis (ALS) patients, and to determine whether it correlates with clinical parameters. Forty-three patients with ALS (25 males) and 43 age- and gender-matched healthy controls underwent MRI on a 3T scanner. We used T1-weighted 3D images covering the whole brain and the cervical SC to estimate cervical SC area and eccentricity at the C2/C3 level using validated software (SpineSeg). Disease severity was quantified with the ALSFRS-R and ALS Severity scores. SC areas of patients and controls were compared with a Mann-Whitney test. We used linear regression to investigate the association between SC area and clinical parameters. The mean age of patients and mean disease duration were 53.1 ± 12.2 years and 34.0 ± 29.8 months, respectively. The two groups differed significantly in SC area (67.8 ± 6.8 mm² vs. 59.5 ± 8.4 mm², p < 0.001). Eccentricity values were similar in both groups (p = 0.394). SC areas correlated with disease duration (r = -0.585, p < 0.001), ALSFRS-R score (r = 0.309, p = 0.044), and the ALS Severity scale (r = 0.347, p = 0.022). In conclusion, patients with ALS have SC atrophy, but no flattening. In addition, SC areas correlated with disease duration and functional status. These data suggest that quantitative MRI of the SC may be a useful biomarker in the disease.
Abstract:
Primary craniocervical dystonia (CCD) is generally attributed to functional abnormalities in the cortico-striato-pallido-thalamocortical loops, but cerebellar pathways have also been implicated in neuroimaging studies. Hence, our purpose was to perform a volumetric evaluation of the infratentorial structures in CCD. We compared 35 DYT1/DYT6-negative patients with CCD and 35 healthy controls. Cerebellar volume was evaluated using manual volumetry (DISPLAY software), and infratentorial volume by voxel-based morphometry (VBM) of gray matter (GM) segments derived from T1-weighted 3 T MRI using the SUIT tool (SPM8/Dartel). We used t-tests to compare infratentorial volumes between groups. Cerebellar volume was (1.14 ± 0.17) × 10² cm³ for controls and (1.13 ± 0.14) × 10² cm³ for patients (p = 0.74). VBM demonstrated GM increase in the left I-IV cerebellar lobules and GM decrease in the left lobules VI and Crus I and in the right lobules VI, Crus I, and VIIIb. In a secondary analysis, VBM also demonstrated GM increase in the brainstem, mostly in the pons. While gray matter increase is observed in the anterior lobe of the cerebellum and in the brainstem, atrophy is concentrated in the posterior lobe of the cerebellum, demonstrating a differential pattern of infratentorial involvement in CCD. This study shows subtle structural abnormalities of the cerebellum and brainstem in primary CCD.