5 results for Precision-recall analysis

in AMS Tesi di Laurea - Alm@DL - Universit


Relevance:

80.00%

Publisher:

Abstract:

Internet traffic classification is a relevant and mature research field, yet one of growing importance and with open technical challenges, partly because of the pervasive presence of Internet-connected devices in everyday life. We claim the need for innovative traffic classification solutions that are lightweight, adopt a domain-based approach, and not only categorize application-level protocols but also classify Internet traffic by subject. To this end, this paper proposes an original classification solution that leverages domain name information extracted from IPFIX summaries, DNS logs, and DHCP leases, and that can be applied to any kind of traffic. The proposed solution is based on an extension of Word2vec unsupervised learning techniques running on a specialized Apache Spark cluster. In particular, learning techniques are leveraged to generate word embeddings from a mixed dataset composed of domain names and natural language corpora in a lightweight way and with general applicability. The paper also reports lessons learnt from our implementation and deployment experience, which demonstrates that our solution can process 5500 IPFIX summaries per second on an Apache Spark cluster with 1 slave instance in Amazon EC2 at a cost of $3860 per year. Reported experimental results for Precision, Recall, F-Measure, Accuracy, and Cohen's Kappa show the feasibility and effectiveness of the proposal. The experiments show that the words contained in domain names are related to the kind of traffic directed towards them; therefore, using specifically trained word embeddings, we are able to classify them into customizable categories. We also show that training word embeddings on larger natural language corpora leads to improvements in precision of up to 180%.
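The evaluation measures named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function name `classification_metrics`, the category labels, and the toy predictions are assumptions made purely for illustration. Precision, recall, and F-measure are computed for one chosen category, while accuracy and Cohen's Kappa are computed over all categories.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive):
    """Per-class precision/recall/F1 plus overall accuracy and Cohen's kappa.

    Hypothetical helper for illustration; `positive` is the category
    whose precision and recall are reported.
    """
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Observed agreement: fraction of items labelled identically.
    accuracy = sum(1 for t, p in pairs if t == p) / n

    # Expected chance agreement from the marginal label frequencies.
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0

    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "kappa": kappa}


# Toy labels (made up for the example, not taken from the paper).
truth = ["video", "video", "news", "news", "social", "social"]
guess = ["video", "news", "news", "news", "social", "video"]
print(classification_metrics(truth, guess, positive="video"))
```

Cohen's Kappa is reported alongside accuracy because it discounts the agreement that would be expected by chance given the label frequencies.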

Relevance:

30.00%

Publisher:

Abstract:

Radial velocities measured from near-infrared (NIR) spectra are a potential tool to search for extrasolar planets around cool stars. The high-resolution infrared spectrographs now available approach the precision of visible instruments, and they keep improving over time. GIANO is an infrared echelle spectrograph and a powerful tool for providing high-resolution spectra for accurate radial velocity measurements of exoplanets and for chemical and dynamical studies of stellar or extragalactic objects. No other IR instrument has GIANO's capability to cover the entire NIR wavelength range. In this work we develop an ensemble of IDL procedures to measure high-precision radial velocities on a few GIANO spectra acquired during the commissioning run, using the telluric lines as wavelength reference. In Section 1.1 various exoplanet search methods are described. They exploit different properties of the planetary system. In Section 1.2 we describe the exoplanet population discovered through the different methods. In Section 1.3 we explain the motivations for NIR radial velocities and the challenges related to the main issue that has limited the pursuit of high-precision NIR radial velocities, that is, the lack of a suitable calibration method. We briefly describe calibration methods in the visible and the solutions for IR calibration, for instance, the use of telluric lines. The latter has advantages and problems, which are described in detail. In this work we use telluric lines as wavelength reference. In Section 1.4 the Cross Correlation Function (CCF) method is described. This method is widely used to measure radial velocities. In Section 1.5 we describe GIANO and its main science targets. In Chapter 2 the observational data obtained with the GIANO spectrograph are presented and the selection criteria are reported. In Chapter 3 we describe the analysis in detail and examine in depth the flow chart reported in Section 3.1.
In Chapter 4 we give the radial velocities measured with our IDL procedure for all available targets. We obtain an rms scatter in radial velocities of about 7 m/s. Finally, we conclude that GIANO can be used to measure radial velocities of late-type stars with an accuracy close to or better than 10 m/s, using telluric lines as wavelength reference. As of September 2014, GIANO is operating at TNG for Science Verification, and more observational data will allow this analysis to be refined further.
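The Cross Correlation Function idea mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the thesis's IDL procedure: the spectrum is synthetic, the line list, line width, and velocity grid are invented, no telluric wavelength reference is modelled, and the names `line_spectrum` and `ccf_radial_velocity` are hypothetical. The sketch shifts a template over a grid of trial velocities, correlates it with the observed spectrum, and takes the peak (refined with a parabola) as the radial velocity.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s


def line_spectrum(wavelengths, line_centers, velocity_kms=0.0,
                  depth=0.5, sigma=0.02):
    """Normalised flux with Gaussian absorption lines, Doppler-shifted."""
    shift = 1.0 + velocity_kms / C_KM_S  # non-relativistic Doppler factor
    flux = []
    for wl in wavelengths:
        absorption = sum(
            depth * math.exp(-0.5 * ((wl - c0 * shift) / sigma) ** 2)
            for c0 in line_centers
        )
        flux.append(1.0 - absorption)
    return flux


def ccf_radial_velocity(wavelengths, observed, line_centers, v_grid):
    """Return the velocity (km/s) that maximises the cross-correlation."""
    obs_mean = sum(observed) / len(observed)
    ccf = []
    for v in v_grid:
        model = line_spectrum(wavelengths, line_centers, v)
        model_mean = sum(model) / len(model)
        ccf.append(sum((o - obs_mean) * (m - model_mean)
                       for o, m in zip(observed, model)))

    k = max(range(len(ccf)), key=ccf.__getitem__)
    if 0 < k < len(ccf) - 1:
        # Parabolic interpolation around the peak for sub-grid precision.
        y0, y1, y2 = ccf[k - 1], ccf[k], ccf[k + 1]
        denom = y0 - 2.0 * y1 + y2
        offset = 0.5 * (y0 - y2) / denom if denom else 0.0
        return v_grid[k] + offset * (v_grid[1] - v_grid[0])
    return v_grid[k]


# Synthetic example: three made-up lines near 1551 nm, shifted by 3 km/s.
wl = [1550.0 + 0.005 * i for i in range(400)]
lines = [1550.5, 1551.0, 1551.6]
observed = line_spectrum(wl, lines, velocity_kms=3.0)
grid = [-10.0 + 0.5 * i for i in range(41)]
print(ccf_radial_velocity(wl, observed, lines, grid))
```

Reaching the m/s level reported in the abstract depends mainly on the wavelength calibration and on the number of lines used, neither of which this toy example attempts to model.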

Relevance:

30.00%

Publisher:

Abstract:

Cyanoacetylene (HC3N) is a molecule of great astronomical importance and has been observed in many interstellar environments. Its deuterated form, DC3N, has been detected in a number of sources, from external galaxies to Galactic interstellar clouds, star-forming regions and planetary atmospheres. All these detections relied on previous laboratory investigations, which, however, still lack some essential information concerning its infrared spectrum. In this project, high-resolution ro-vibrational spectra of DC3N have been recorded in two energy regions: 150–450 cm⁻¹ and 1800–2800 cm⁻¹. In the first window the ν7 ← GS, 2ν7 ← ν7, ν5 ← ν7, ν5+ν7 ← 2ν7, ν6+ν7 → 2ν7, 4ν7 ← 2ν7 bands have been assigned, while in the second region the three stretching fundamental bands ν1, ν2, ν3 have been observed and analysed. The 150–450 cm⁻¹ spectra have been recorded at the AILES beamline of the SOLEIL synchrotron (France), and the 1800–2800 cm⁻¹ spectra at the Department of Industrial Chemistry “Toso Montanari” in Bologna. In total, 2299 transitions have been assigned. These experimental transitions, together with data previously recorded for DC3N, were included in a least-squares fitting procedure from which several spectroscopic parameters have been determined with high precision and accuracy. They include rotational, vibrational and resonance constants. The spectroscopic data of DC3N have been included in a line catalog for this molecule in order to assist future astronomical observations and data interpretation. A paper that includes this research work has been published (M. Melosso, L. Bizzocchi, A. Adamczyk, E. Cane, P. Caselli, L. Colzi, L. Dore, B. M. Giuliano, J.-C. Guillemin, M.-A. Martin-Drumel, O. Pirali, A. Pietropolli Charmet, D. Prudenzano, V. M. Rivilla, F. Tamassia, Extensive ro-vibrational analysis of deuterated-cyanoacetylene (DC3N) from millimeter wavelengths to the infrared domain, Jour. of Quant. Spectr. and Rad. Tran. 254, 107221, 2020).

Relevance:

30.00%

Publisher:

Abstract:

Rail transportation will play a significant role in the future. This role is tightly bound to accessible, sustainable, efficient and safe railway systems. Precise positioning in railway applications is essential for increasing railway traffic, train-track control, collision avoidance, train management and autonomous train driving. Hence, precise train positioning is a safety-critical application. Nowadays, positioning in railway applications depends heavily on a cellular-based system called GSM-R, a railway-specific version of the Global System for Mobile Communications (GSM). However, GSM-R is a relatively outdated technology and does not provide the capacity and precision demanded by future railway networks. One option for positioning is mounting Global Navigation Satellite System (GNSS) receivers on trains as a low-cost solution. Nevertheless, GNSS cannot provide continuous service because signals are interrupted by harsh environments, tunnels, etc. Another option is exploiting cellular-based positioning methods. The most recent cellular technology, 5G, provides high network capacity, low latency, high accuracy and high availability suitable for train positioning. In this thesis, an approach to 5G-based positioning for railway systems is discussed and simulated. The Observed Time Difference of Arrival (OTDOA) method and the 5G Positioning Reference Signal (PRS) are used. Simulations are run in MATLAB, extending existing code developed for 5G positioning with Non-Line-of-Sight (NLOS) link detection and base station exclusion algorithms. A performance analysis is carried out for different configurations. Results show that efficient NLOS detection improves positioning accuracy and that a base station exclusion algorithm improves it further.
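The geometric core of OTDOA positioning can be sketched as follows. This Python example is an assumption-laden illustration, not the thesis's MATLAB simulation: it works in 2D, uses noise-free time differences, ignores the PRS signal processing, NLOS detection, and base station exclusion, and the station layout and function names (`tdoa_from_position`, `solve_otdoa`) are hypothetical. It estimates a position from arrival-time differences relative to a reference station using Gauss-Newton iterations.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def tdoa_from_position(pos, stations):
    """Arrival-time differences (s) of each station relative to stations[0]."""
    d = [math.dist(pos, s) for s in stations]
    return [(di - d[0]) / C for di in d[1:]]


def solve_otdoa(tdoas, stations, guess, iterations=20):
    """Gauss-Newton estimate of (x, y) from TDOAs; stations[0] is the reference."""
    x, y = guess
    for _ in range(iterations):
        d = [math.dist((x, y), s) for s in stations]
        a11 = a12 = a22 = b1 = b2 = 0.0  # accumulators for J^T J and J^T r
        for i in range(1, len(stations)):
            # Residual between measured and predicted range difference (m).
            r = tdoas[i - 1] * C - (d[i] - d[0])
            # Partial derivatives of the predicted range difference.
            jx = (x - stations[i][0]) / d[i] - (x - stations[0][0]) / d[0]
            jy = (y - stations[i][1]) / d[i] - (y - stations[0][1]) / d[0]
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * r
            b2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break  # degenerate geometry, cannot update
        # Solve the 2x2 normal equations by Cramer's rule.
        dx = (a22 * b1 - a12 * b2) / det
        dy = (a11 * b2 - a12 * b1) / det
        x, y = x + dx, y + dy
        if math.hypot(dx, dy) < 1e-6:
            break
    return x, y


# Made-up layout: four base stations at the corners of a 1 km square.
stations = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (1000.0, 1000.0)]
measured = tdoa_from_position((300.0, 450.0), stations)
print(solve_otdoa(measured, stations, guess=(500.0, 500.0)))
```

In this formulation, an NLOS link would appear as a positive bias in one of the measured time differences, which is why detecting such links and excluding the affected base stations can improve the estimate.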

Relevance:

30.00%

Publisher:

Abstract:

This thesis contributes to the ArgMining 2021 shared task on Key Point Analysis. Key Point Analysis entails extracting a concise list of the most prominent talking points from an input corpus and calculating their prevalence. These talking points are usually referred to as key points. Key Point Analysis is divided into two subtasks: Key Point Matching, which involves assigning a matching score to each key point/argument pair, and Key Point Generation, which consists of generating the key points. Key Point Matching was approached with two different models: a pretrained Sentence Transformers model and a tree-constrained Graph Neural Network. The best model was the fine-tuned Sentence Transformers model, which achieved a mean Average Precision score of 0.75, ranking 12th among the participating teams. The model was then used for the Key Point Generation subtask, where key point candidates were selected with an extractive method and evaluated with the model developed for the previous subtask.
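The mean Average Precision score quoted above can be illustrated with the generic ranking definition. The following Python sketch is an assumption: the shared task applies its own evaluation protocol, which may differ in detail, and the function names and toy scores are invented for the example. For each group of candidate pairs, the pairs are ranked by matching score, precision is averaged at the rank of every true match, and the result is then averaged over the groups.

```python
def average_precision(scores, labels):
    """Average precision of one ranked list; labels are 1 (match) or 0."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this match's rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(groups):
    """Mean of the per-group average precisions; groups is a list of
    (scores, labels) tuples."""
    return sum(average_precision(s, l) for s, l in groups) / len(groups)


# Toy matching scores for two groups (made up for the example).
groups = [
    ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]),
    ([0.5, 0.4], [0, 1]),
]
print(mean_average_precision(groups))
```

Because only the ordering of the scores matters, this measure rewards a model that ranks true key point/argument matches above non-matches, regardless of the absolute score values.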