997 results for Source codes
Abstract:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility of merging the research fields of machine learning, programming languages, and software engineering. This area is often referred to as Big Code: programming languages are treated in place of natural languages, and different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications that can be developed within Big Code, the work presented in this thesis focuses mainly on two tasks: Programming Language Identification (PLI) and Software Defect Prediction (SDP) for source code. Programming language identification is commonly needed in program comprehension and is usually performed directly by developers. However, at large scale, as in widely used archives (GitHub, Software Heritage), automating this task is desirable. To this end, the problem is analyzed from different points of view (text-based and image-based learning approaches), and different models are created with particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were found by manual inspection or with automatic static and dynamic analyzers. Now, this task can be automated using learning approaches that speed up and improve the related procedures. Here, two models have been built and analyzed to detect some of the most common bugs and errors at different levels of code granularity (file and method levels). The data and model architectures used are analyzed and described in detail. Quantitative and qualitative results are reported for both the PLI and SDP tasks, and differences from and similarities to related works are discussed.
Abstract:
Surface heat treatment of glasses and ceramics using CO2 lasers has attracted the attention of several researchers around the world due to its impact on technological applications such as lab-on-a-chip devices, diffraction gratings and microlenses. Microlens fabrication on a glass surface has been studied mainly due to its importance in optical devices (fiber coupling, CCD signal enhancement, etc.). The goal of this work is to present a systematic study of the conditions for microlens fabrication, along with the viability of using microlens arrays, recorded on the glass surface, as bidimensional codes for product identification. This would allow the production of codes that leave no residues (such as the fine powder generated by laser ablation) and that are resistant to aggressive environments, such as sterilization processes. The microlens arrays were fabricated using a continuous-wave CO2 laser focused on the surface of flat commercial soda-lime silicate glass substrates. The fabrication conditions were studied based on laser power, heating time and microlens profiles. A He-Ne laser was used as a light source in a qualitative experiment to test the viability of using the microlenses as bidimensional codes.
Abstract:
Background. The use of hospital discharge administrative data (HDAD) has been recommended for automating, improving, even substituting, population-based cancer registries. The frequency of false positive and false negative cases recommends local validation. Methods. The aim of this study was to detect newly diagnosed, false positive and false negative cases of cancer from hospital discharge claims, using four Spanish population-based cancer registries as the gold standard. Prostate cancer was used as a case study. Results. A total of 2286 incident cases of prostate cancer registered in 2000 were used for validation. In the most sensitive algorithm (that using five diagnostic codes), estimates for Sensitivity ranged from 14.5% (CI95% 10.3-19.6) to 45.7% (CI95% 41.4-50.1). In the most predictive algorithm (that using five diagnostic and five surgical codes) Positive Predictive Value estimates ranged from 55.9% (CI95% 42.4-68.8) to 74.3% (CI95% 67.0-80.6). The most frequent reason for false positive cases was the number of prevalent cases inadequately considered as newly diagnosed cancers, ranging from 61.1% to 82.3% of false positive cases. The most frequent reason for false negative cases was related to the number of cases not attended in hospital settings. In this case, figures ranged from 34.4% to 69.7% of false negative cases, in the most predictive algorithm. Conclusions. HDAD might be a helpful tool for cancer registries to reach their goals. The findings suggest that, for automating cancer registries, algorithms combining diagnoses and procedures are the best option. However, for cancer surveillance purposes, in those cancers like prostate cancer in which care is not only hospital-based, combining inpatient and outpatient information will be required.
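The two validation measures reported above can be sketched as follows. This is a minimal illustration, not the study's own procedure: the function names and the counts in the usage lines are hypothetical, and the Wilson score interval is an assumption, since the abstract does not say how its 95% confidence intervals were computed.

```python
import math


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of registry-confirmed cases that the algorithm detects."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of algorithm-flagged cases that the registry confirms."""
    return true_pos / (true_pos + false_pos)


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (assumed method, z = 1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical counts, for illustration only (not taken from the study).
tp, fn, fp = 45, 55, 15
print(f"sensitivity = {sensitivity(tp, fn):.3f}")
print(f"PPV         = {positive_predictive_value(tp, fp):.3f}")
print("95% CI for sensitivity:", wilson_interval(tp, tp + fn))
```

Here the registry plays the role of the gold standard: a false negative is a registry case the claims algorithm misses, and a false positive is a flagged claim the registry does not confirm as an incident case.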
Abstract:
Background: Over the last two decades, mortality from coronary heart disease (CHD) and cerebrovascular disease (CVD) declined by about 30% in the European Union (EU). Design: We analyzed trends in CHD (ICD-10 codes: I20-I25) and CVD (ICD-10 codes: I60-I69) mortality in young adults (age 35-44 years) in the EU as a whole and in 12 selected European countries, over the period 1980-2007. Methods: Data were derived from the World Health Organization mortality database. With joinpoint regression analysis, we identified significant changes in trends and estimated average annual percent changes (AAPC). Results: CHD mortality rates at ages 35-44 years have decreased in both sexes since the 1980s for most countries, except for Russia (130/100,000 men and 24/100,000 women, in 2005-7). The lowest rates (around 9/100,000 men, 2/100,000 women) were in France, Italy and Sweden. In men, the steepest declines in mortality were in the Czech Republic (AAPC = -6.1%), the Netherlands (-5.2%), Poland (-4.5%), and England and Wales (-4.5%). Patterns were similar in women, though with appreciably lower rates. The AAPC in the EU was -3.3% for men (rate = 16.6/100,000 in 2005-7) and -2.1% for women (rate = 3.5/100,000). For CVD, Russian rates in 2005-7 were 40/100,000 men and 16/100,000 women, 5 to 10-fold higher than in most western European countries. The steepest declines were in the Czech Republic and Italy for men, in Sweden and the Czech Republic for women. The AAPC in the EU was -2.5% in both sexes, with steeper declines after the mid-late 1990s (rates = 6.4/100,000 men and 4.3/100,000 women in 2005-7). Conclusions: CHD and CVD mortality steadily declined in Europe, except in Russia, whose rates were 10 to 15-fold higher than those of France, Italy or Sweden. Hungary and Poland, and also Scotland, where CHD trends were less favourable than in other western European countries, also emerge as priorities for preventive interventions.
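For readers unfamiliar with the AAPC, the sketch below shows how it is commonly derived from a fitted joinpoint model: each segment has a slope on the log-rate scale, and the AAPC is the exponentiated, length-weighted average of those slopes. The function names and the segment values in the usage lines are hypothetical; the abstract does not report the fitted segments.

```python
import math


def annual_percent_change(slope: float) -> float:
    """APC of one log-linear segment: rate_t = rate_0 * exp(slope * t)."""
    return (math.exp(slope) - 1.0) * 100.0


def average_annual_percent_change(segments: list[tuple[float, float]]) -> float:
    """AAPC from (slope, length_in_years) pairs of a joinpoint model.

    The slopes are averaged with weights equal to the segment lengths,
    then transformed back to a percent change per year.
    """
    total_years = sum(years for _, years in segments)
    if total_years <= 0:
        raise ValueError("segments must cover a positive number of years")
    mean_slope = sum(slope * years for slope, years in segments) / total_years
    return (math.exp(mean_slope) - 1.0) * 100.0


# Hypothetical two-segment fit: a slow decline followed by a faster one.
fit = [(-0.02, 15.0), (-0.05, 12.0)]
print(f"AAPC = {average_annual_percent_change(fit):.2f}% per year")
```

With a single segment the AAPC reduces to that segment's APC, so a constant 3.3% yearly decline corresponds to a log-rate slope of ln(0.967).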
Abstract:
This paper presents the design and implementation of QRP, an open-source proof-of-concept authentication system that uses two-factor authentication, combining a password with a camera-equipped mobile phone that acts as an authentication token. QRP is extremely secure, as all the sensitive information stored and transmitted is encrypted, but it is also an easy-to-use and cost-efficient solution. QRP is portable and can be used securely on untrusted computers. Finally, QRP is able to authenticate successfully even when the phone is offline.
Abstract:
There has been considerable interest in the climate impact of trends in stratospheric water vapor (SWV). However, the representation of the radiative properties of water vapor under stratospheric conditions remains poorly constrained across different radiation codes. This study examines the sensitivity of a detailed line-by-line (LBL) code, a Malkmus narrow-band model and two broadband GCM radiation codes to a uniform perturbation in SWV in the longwave spectral region. The choice of sampling rate in wave number space (Δν) in the LBL code is shown to be important for calculations of the instantaneous change in heating rate (ΔQ) and the instantaneous longwave radiative forcing (ΔFtrop). ΔQ varies by up to 50% for values of Δν spanning 5 orders of magnitude, and ΔFtrop varies by up to 10%. In the three less detailed codes, ΔQ differs by up to 45% at 100 hPa and 50% at 1 hPa compared to an LBL calculation. This causes differences of up to 70% in the equilibrium fixed dynamical heating temperature change due to the SWV perturbation. The stratosphere-adjusted radiative forcing differs by up to 96% across the less detailed codes. The results highlight an important source of uncertainty in quantifying and modeling the links between SWV trends and climate.
Abstract:
This thesis describes the development of new models and toolkits for orbit determination codes, to support and improve the precise radio tracking experiments of the Cassini-Huygens mission, an interplanetary mission to study the Saturn system. The core of the orbit determination process is the comparison between observed and computed observables. Disturbances in either the observed or the computed observables degrade the orbit determination process. Chapter 2 describes a detailed study of the numerical errors in the Doppler observables computed by NASA's ODP and MONTE, and ESA's AMFIN. A mathematical model of the numerical noise was developed and successfully validated against the Doppler observables computed by the ODP and MONTE, with typical relative errors smaller than 10%. The numerical noise proved to be, in general, an important source of noise in the orbit determination process and, in some conditions, it may become the dominant noise source. Three different approaches to reducing the numerical noise were proposed. Chapter 3 describes the development of the multiarc library, which makes it possible to perform a multi-arc orbit determination with MONTE. The library was developed during the analysis of the Cassini radio science gravity experiments on Saturn's satellite Rhea. Chapter 4 presents the estimation of Rhea's gravity field obtained from a joint multi-arc analysis of the Cassini R1 and R4 fly-bys, describing in detail the spacecraft dynamical model used, the data selection and calibration procedure, and the analysis method followed. In particular, the approach of estimating the full unconstrained quadrupole gravity field was followed, obtaining a solution that is statistically incompatible with the condition of hydrostatic equilibrium. The solution proved to be stable and reliable.
The normalized moment of inertia is in the range 0.37-0.4, indicating that Rhea may be almost homogeneous, or at least characterized by a small degree of differentiation.
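The link between the normalized moment of inertia and differentiation can be illustrated with a two-layer sphere. A homogeneous sphere has C/(MR^2) = 0.4 exactly, and concentrating mass toward the center lowers that value. The sketch below is a textbook model, not the thesis's interior model; the function name and the densities and radii in the usage lines are hypothetical and are not Rhea's values.

```python
def normalized_moment_of_inertia(radius: float, core_radius: float,
                                 core_density: float, mantle_density: float) -> float:
    """C / (M R^2) for a sphere with a uniform core inside a uniform mantle.

    M = (4/3) pi [rho_m R^3 + (rho_c - rho_m) r_c^3]
    C = (8/15) pi [rho_m R^5 + (rho_c - rho_m) r_c^5]
    The factors of pi cancel in the ratio, leaving the 2/5 prefactor.
    """
    if not 0.0 < core_radius <= radius:
        raise ValueError("core_radius must lie in (0, radius]")
    contrast = core_density - mantle_density
    mass_term = mantle_density * radius**3 + contrast * core_radius**3
    inertia_term = mantle_density * radius**5 + contrast * core_radius**5
    return 0.4 * inertia_term / (mass_term * radius**2)


# Hypothetical values, for illustration only.
uniform = normalized_moment_of_inertia(1.0, 0.5, 1200.0, 1200.0)
layered = normalized_moment_of_inertia(1.0, 0.5, 3000.0, 1000.0)
print(f"uniform sphere: {uniform:.3f}")
print(f"dense core:     {layered:.3f}")
```

A value just below 0.4, as in the reported range, is therefore what one would expect from a body with little density contrast between interior and exterior.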
Abstract:
Growth codes are a subclass of rateless codes that have found interesting applications in data dissemination problems. Compared to other rateless and conventional channel codes, Growth codes show improved intermediate performance, which is particularly useful in applications where partial data has some utility. In this paper, we investigate the asymptotic performance of Growth codes using the Wormald method, which was proposed for studying the peeling decoder of LDPC and LDGM codes. In contrast to previous works, the Wormald differential equations are set up from the nodes' perspective, which enables a numerical solution for the expected asymptotic decoding performance of Growth codes. Our framework is appropriate for any class of rateless codes that does not include a precoding step. We further study the performance of Growth codes with moderate and large codeblock sizes through simulations, and we use the generalized logistic function to model the decoding probability. We then exploit the decoding probability model in an illustrative application of Growth codes to error-resilient video transmission. The video transmission problem is cast as a joint source and channel rate allocation problem that is shown to be convex with respect to the channel rate. This illustrative application highlights the main advantage of Growth codes, namely improved performance in the intermediate loss region.
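The peeling decoder mentioned above can be sketched for a generic XOR-based rateless code. This is a minimal illustration, not the paper's framework: the function name and the symbol layout (a set of source indices paired with the XOR of those sources) are assumptions, and no Growth-code degree distribution is modeled.

```python
def peel(symbols: list[tuple[set[int], int]]) -> dict[int, int]:
    """Peeling decoder for XOR-encoded symbols.

    Each symbol is (indices, value), where value is the XOR of the source
    blocks listed in indices. The decoder repeatedly strips already
    recovered sources from every symbol; a symbol left with exactly one
    unknown index reveals that source block.
    """
    pending = [(set(indices), value) for indices, value in symbols]
    recovered: dict[int, int] = {}
    progress = True
    while progress:
        progress = False
        still_pending = []
        for indices, value in pending:
            # Remove the contribution of every source that is already known.
            for i in list(indices):
                if i in recovered:
                    indices.discard(i)
                    value ^= recovered[i]
            if len(indices) == 1:
                (i,) = indices
                recovered[i] = value
                progress = True
            elif len(indices) > 1:
                still_pending.append((indices, value))
            # A symbol with no unknown index left is redundant and is dropped.
        pending = still_pending
    return recovered


# Hypothetical source blocks and received symbols.
source = [5, 9, 12]
received = [({0}, source[0]),
            ({0, 1}, source[0] ^ source[1]),
            ({1, 2}, source[1] ^ source[2])]
print(peel(received))       # every block is recoverable from these symbols
print(peel(received[:1]))   # with fewer symbols, only part of the data is recovered
```

The second call shows the partial-recovery behavior that makes intermediate performance meaningful: a receiver that has collected too few symbols still decodes some of the source blocks.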
Abstract:
Caffeine has already been used as an indicator of anthropogenic impacts, especially the ones related to the disposal of sewage in water bodies. In this work, the presence of caffeine has been correlated with the estrogenic activity of water samples measured using the BLYES assay. After testing 96 surface water samples, it was concluded that caffeine can be used to prioritize samples to be tested for estrogenic activity in water quality programs evaluating emerging contaminants with endocrine disruptor activity.
Abstract:
SHED (stem cells from human exfoliated deciduous teeth) represent a population of postnatal stem cells capable of extensive proliferation and multipotential differentiation. Primary teeth may be an ideal source of postnatal stem cells to regenerate tooth structures and bone, and possibly to treat neural tissue injury or degenerative diseases. SHED are highly proliferative cells derived from an accessible tissue source, and therefore hold potential for providing enough cells for clinical applications. In this review, we describe the current knowledge about dental pulp stem cells and discuss tissue engineering approaches that use SHED to replace irreversibly inflamed or necrotic pulps with a healthy and functionally competent tissue that is capable of forming new dentin.
Abstract:
The Lattes platform is the major scientific information system maintained by the National Council for Scientific and Technological Development (CNPq). This platform makes it possible to manage the curricular information of researchers and institutions working in Brazil, based on the so-called Lattes Curriculum. However, the public information is available only individually for each researcher, and the platform does not provide automatic creation of reports on the scientific production of research groups. It is thus difficult to extract and summarize useful knowledge for medium to large groups of researchers. This paper describes the design and implementation of, and experiences with, scriptLattes: an open-source system for creating academic reports on groups based on curricula from the Lattes Database. The scriptLattes system is composed of the following modules: (a) data selection, (b) data preprocessing, (c) redundancy treatment, (d) collaboration graph generation among group members, (e) research map generation based on geographical information, and (f) automatic report creation of bibliographical, technical and artistic production, and academic supervisions. The system has been extensively tested on a large variety of research groups at Brazilian institutions, and the generated reports have proven to be a useful way to extract knowledge easily from data in the context of the Lattes platform. The source code, usage instructions and examples are available at http://scriptlattes.sourceforge.net/.
Abstract:
The heterometal alkoxide [FeCl{Ti2(OPri)9}] (1) was employed as a single-source precursor for the preparation of Fe/Ti oxides under inert atmosphere. Three different synthetic procedures were adopted in the processing of 1, employing aqueous HNO3 solution, aqueous HCl solution, or no mineral acid. Products were characterised by powder X-ray diffractometry, scanning electron microscopy combined with energy dispersive X-ray spectroscopy (SEM/EDS) and Raman, electron paramagnetic resonance (EPR) and Mössbauer spectroscopies. Oxide products contained titanium(IV) and either iron(III) or iron(II), depending on reaction conditions and thermal treatment temperatures. An interesting iron(III)→iron(II) reduction was observed at 1000 °C in the HNO3-containing system, leading to the detection of ilmenite (FeTiO3). SEM/EDS studies revealed a highly heterogeneous metal distribution in all products, possibly related to the presence of a significant content of carbon and of structural defects (oxygen vacancies) in the solids.
Abstract:
In this work, the modifications promoted by alkaline hydrolysis and glutaraldehyde (GA) crosslinking on type I collagen found in porcine skin have been studied. Collagen matrices were obtained from the alkaline hydrolysis of porcine skin, with subsequent GA crosslinking at different concentrations and reaction times. The elastin content determination showed that, regardless of the treatment, elastin was present in the matrices. Results obtained from in vitro trypsin degradation indicated that the degradation rate decreased as GA concentration and reaction time increased. Thermogravimetry and differential scanning calorimetry analyses showed that the collagen in the matrices becomes more resistant to thermal degradation as the degree of crosslinking increases. Scanning electron microscopy analysis indicated that, after GA crosslinking, collagen fibers become more organized and well-defined. Therefore, the preparation of porcine skin matrices with different degradation rates, which can be used in soft tissue reconstruction, is viable.
Abstract:
This work aimed to determine the levels of the 16 US EPA priority PAHs in atmospheric particulate matter at three sites around Salvador, Bahia: (i) Lapa bus station, strongly impacted by heavy-duty diesel vehicles; (ii) Aratu harbor, impacted by an intense movement of goods; and (iii) Bananeira village on Maré Island, a site not influenced by vehicles, with activities such as handcraft work and fisheries. Results indicated that BbF (0.130-6.85 ng m⁻³) was the PAH with the highest concentration in samples from Aratu harbor and Bananeira, while CRY (0.075-6.85 ng m⁻³) presented higher concentrations at Lapa station. PAH sources at the studied sites were mainly of anthropogenic origin, such as gasoline-fueled light-duty vehicles and diesel-fueled heavy-duty vehicles, discharges in the port, diesel burning from ships, dust resuspension, indoor soot from cooking, and coal and wood combustion for energy production.