13 results for audio-visual automatic speech recognition
in Scielo Saúde Pública - SP
Abstract:
The present report describes the development of a technique for automatic wheezing recognition in digitally recorded lung sounds. This method is based on the extraction and processing of spectral information from the respiratory cycle and the use of these data for user feedback and automatic recognition. The respiratory cycle is first pre-processed in order to normalize its spectral information, and its spectrogram is then computed. The spectrogram image is then processed by a two-dimensional convolution filter and a half-threshold in order to increase the contrast and isolate its highest-amplitude components, respectively. To generate more compact data for automatic recognition, the spectral projection of the processed spectrogram is computed and stored as an array. The highest-magnitude values of the array and their respective spectral values are then located and used as inputs to a multi-layer perceptron artificial neural network, which outputs an automatic indication of the presence of wheezes. For validation of the methodology, lung sounds recorded from three different repositories were used. The results show that the proposed technique achieves 84.82% accuracy in the detection of wheezing for an isolated respiratory cycle and 92.86% accuracy for the detection of wheezes when detection is carried out using groups of respiratory cycles obtained from the same person. The system also presents the original recorded sound and the post-processed spectrogram image so that users can draw their own conclusions from the data.
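The feature-extraction steps described in this abstract (normalization, spectrogram, two-dimensional convolution, half-threshold, spectral projection, peak selection) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the report: the function names, the frame length and hop, the 1x5 averaging kernel, the reading of "half-threshold" as "keep values at or above half the maximum", and the number of peaks. The multi-layer perceptron stage is omitted.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram (frequency bins x frames) from a Hann-windowed STFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.abs(np.fft.rfft(frames, axis=0))

def convolve2d_same(image, kernel):
    """Naive same-size 2-D convolution with zero padding (odd kernel sizes)."""
    kh, kw = kernel.shape
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    flipped = kernel[::-1, ::-1]
    out = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = np.sum(padded[r : r + kh, c : c + kw] * flipped)
    return out

def wheeze_features(signal, fs, n_peaks=3, frame_len=256, hop=128):
    """Return [top projection magnitudes..., their frequencies in Hz...]."""
    # Amplitude normalization (assumed form of the pre-processing step).
    signal = signal / (np.max(np.abs(signal)) + 1e-12)
    spec = spectrogram(signal, frame_len, hop)
    # Hypothetical kernel: averaging along time favors sustained tonal components.
    kernel = np.ones((1, 5)) / 5.0
    filtered = convolve2d_same(spec, kernel)
    # "Half-threshold" interpreted here as zeroing values below half the maximum.
    thresholded = np.where(filtered >= 0.5 * filtered.max(), filtered, 0.0)
    # Spectral projection: sum over time for each frequency bin.
    projection = thresholded.sum(axis=1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    top = np.argsort(projection)[::-1][:n_peaks]
    return np.concatenate([projection[top], freqs[top]])
```

In this sketch the returned vector would play the role of the network input; a sustained tone in the signal should dominate the projection, so the first frequency entry lands near the tone's pitch.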
Abstract:
The teaching apprenticeship established by CAPES for graduate scholarship holders is discussed, and the criterion adopted for its implementation in the graduate program in Inorganic Chemistry is presented. A teaching plan for the new subject is proposed, based on the experience gained with a first group. An instrument for evaluating student performance has been developed and analyzed. Aspects such as knowledge, clarity, enthusiasm, confidence, good handling of audio-visual resources, and class length are listed by degree of importance, along with the major difficulties faced and pointed out by the students.
Abstract:
Previous assessment of verticality by means of the rod test and the rod-and-frame test indicated that human subjects can be more (field dependent) or less (field independent) influenced by a frame placed around a tilted rod. In the present study we propose a new approach to these tests. The judgment of visual verticality (rod test) was evaluated in 50 young subjects (28 males, ranging in age from 20 to 27 years) by randomly projecting a luminous rod tilted between -18 and +18° (negative values indicating left tilts) onto a tangent screen. In the rod-and-frame test the rod was displayed within a luminous fixed frame tilted at +18 or -18°. Subjects were instructed to verbally indicate the rod’s inclination direction (forced choice). Visual dependency was estimated by means of a Visual Index calculated from the rod test and rod-and-frame test values. Based on this index, volunteers were classified as field dependent, intermediate and field independent. A fourth category was created within the field-independent subjects for those whose number of correct guesses in the rod-and-frame test exceeded that of the rod test, thus indicating improved performance when a surrounding frame was present. In conclusion, the combined use of subjective visual vertical and the rod-and-frame test provides a specific and reliable form of evaluation of verticality in healthy subjects and might be of use to probe changes in brain function after central or peripheral lesions.
Abstract:
Childhood cancer treatment causes several side effects, such as ototoxicity, which can damage inner ear structures and may lead to hearing loss. OBJECTIVE: To estimate the prevalence of hearing loss in children and adolescents with cancer using three classifications: American Speech-Language-Hearing Association (ASHA), Pediatric Oncology Group Toxicity (POGT) and Bilateral Hearing Loss (PAB, from the Portuguese "Perda Auditiva Bilateral"). STUDY DESIGN: Cross-sectional. MATERIAL AND METHOD: Ninety-four patients seen between 2003 and 2004 were analyzed. The subjects underwent visual inspection of the external acoustic meatus and audiological evaluation. Descriptive statistics were used to characterize the sample, and the Kappa statistic was used to analyze the agreement on hearing loss among the three classifications. RESULTS: The prevalence of hearing loss was 42.5% by ASHA, 40.4% by POGT and 12.8% by PAB. Agreement between POGT and PAB and between PAB and ASHA was weak (k=0.36 and k=0.33, respectively). Agreement between ASHA and POGT was almost perfect (k=0.96). CONCLUSIONS: Hearing loss is an important side effect in patients with cancer. Auditory monitoring is essential because it enables early detection and review of the treatment. It is recommended to adopt a classification that covers mild hearing losses, such as the one proposed by ASHA.
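As an illustration of the agreement analysis mentioned in this abstract, the following is a minimal sketch of Cohen's kappa for two classifications applied to the same patients. The abstract only says "Kappa statistic"; that the two-rater Cohen form was used is an assumption, and the function name and the example labels below are hypothetical, not study data.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences over the same subjects."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of subjects given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both classifications use a single identical label
    return (observed - expected) / (1.0 - expected)

# Hypothetical labels (1 = hearing loss, 0 = no hearing loss) for ten patients.
classification_1 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
classification_2 = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(classification_1, classification_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two classifications with very different prevalence can agree on most patients and still yield a low value.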
Abstract:
OBJECTIVE: To investigate the development of language and of auditory and visual functions in infants attending a day care center, based on assessment carried out by educators. METHODS: A total of 115 infants, users of a day care center in the health area of a university in the State of São Paulo, were evaluated from 1998 to 2001. The "Protocol for Observation of the Development of Language and of Auditory and Visual Functions" ("Protocolo da Observação do Desenvolvimento de Linguagem e das Funções Auditiva e Visual"), with 39 items in total, was used to assess infants from 3 to 12 months of age. The Protocol was applied by the day care educators, who had been duly trained. The Chi-square test or Fisher's exact test was used. The significance level adopted was 5%. RESULTS: The infants showed a different pattern in language development regarding the onset of babbling and first words, as well as in visual function regarding imitation, the use of gestural games, and following commands given with gestures. CONCLUSIONS: The day care environment provides conditions for a different pattern of development of language and of auditory and visual functions. Prevention actions in day care centers should integrate the areas of health and education toward a common goal.
Abstract:
The early facilitatory effect of a peripheral spatially visual prime stimulus described in the literature for simple reaction time tasks has been usually smaller than that described for complex (go/no-go, choice) reaction time tasks. In the present study we investigated the reason for this difference. In a first and a second experiment we tested the participants in both a simple task and a go/no-go task, half of them beginning with one of these tasks and half with the other one. We observed that the prime stimulus had an early effect, inhibitory for the simple task and facilitatory for the go/no-go task, when the task was performed first. No early effect appeared when the task was performed second. In a third and a fourth experiment the participants were, respectively, tested in the simple task and in the go/no-go task for four sessions (the prime stimulus was presented in the second, third and fourth sessions). The early effects of the prime stimulus did not change across the sessions, suggesting that a habituatory process was not the cause for the disappearance of these effects in the first two experiments. Our findings are compatible with the idea that different attentional strategies are adopted in simple and complex reaction time tasks. In the former tasks the gain of automatic attention mechanisms may be adjusted to a low level and in the latter tasks, to a high level. The attentional influence of the prime stimulus may be antagonized by another influence, possibly a masking one.
Abstract:
Several methods are used to estimate anaerobic threshold (AT) during exercise. The aim of the present study was to compare AT obtained by a graphic visual method for the estimate of ventilatory and metabolic variables (gold standard) with that obtained by the bi-segmental linear regression mathematical model of Hinkley's algorithm applied to heart rate (HR) and carbon dioxide output (VCO2) data. Thirteen young (24 ± 2.63 years old) and 16 postmenopausal (57 ± 4.79 years old) healthy and sedentary women underwent a continuous ergospirometric incremental test on an electromagnetic braking cycloergometer with 10 to 20 W/min increases until physical exhaustion. The ventilatory variables were recorded breath-to-breath and HR was obtained beat-to-beat in real time. Data were analyzed by the nonparametric Friedman test and Spearman correlation test with the level of significance set at 5%. Power output (W), HR (bpm), oxygen uptake (VO2; mL kg⁻¹ min⁻¹), VO2 (mL/min), VCO2 (mL/min), and minute ventilation (VE; L/min) data observed at the AT level were similar for both methods and groups studied (P > 0.05). The VO2 (mL kg⁻¹ min⁻¹) data showed significant correlation (P < 0.05) between the gold standard method and the mathematical model when applied to HR (r_s = 0.75) and VCO2 (r_s = 0.78) data for the subjects as a whole (N = 29). The proposed mathematical method for the detection of changes in response patterns of VCO2 and HR was adequate and promising for AT detection in young and middle-aged women, representing a semi-automatic, non-invasive and objective AT measurement.
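The idea of a bi-segmental linear regression can be sketched as follows. This is not Hinkley's algorithm as used in the study; an exhaustive least-squares search over candidate split points is swapped in purely for illustration. The function names, the minimum segment size, and the example series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, sum of squared errors)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, sse

def two_segment_breakpoint(xs, ys, min_points=3):
    """Index k that splits the data into xs[:k] and xs[k:] with minimal total SSE."""
    best_k, best_sse = None, None
    for k in range(min_points, len(xs) - min_points + 1):
        sse_left = fit_line(xs[:k], ys[:k])[2]
        sse_right = fit_line(xs[k:], ys[k:])[2]
        total = sse_left + sse_right
        if best_sse is None or total < best_sse:
            best_k, best_sse = k, total
    if best_k is None:
        raise ValueError("not enough points for two segments")
    return best_k

# Hypothetical response that rises slowly, then more steeply after x = 10.
workload = list(range(20))
response = [float(x) if x < 10 else 10.0 + 3.0 * (x - 10) for x in workload]
split = two_segment_breakpoint(workload, response)
```

The split index marks where one straight line stops describing the response and a second, steeper one takes over, which is the kind of change in response pattern the abstract associates with the anaerobic threshold.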
Abstract:
A long-standing debate in the literature is whether attention can form two or more independent spatial foci in addition to the well-known unique spatial focus. There is evidence that voluntary visual attention divides in space. The possibility that this also occurs for automatic visual attention was investigated here. Thirty-six female volunteers were tested. In each trial, a prime stimulus was presented in the left or right visual hemifield. This stimulus was characterized by the blinking of a superior, middle or inferior ring, the blinking of all these rings, or the blinking of the superior and inferior rings. A target stimulus to which the volunteer should respond with the same side hand or a target stimulus to which she should not respond was presented 100 ms later in a primed location, a location between two primed locations or a location in the contralateral hemifield. Reaction time to the positive target stimulus in a primed location was consistently shorter than reaction time in the horizontally corresponding contralateral location. This attentional effect was significantly smaller or absent when the positive target stimulus appeared in the middle location after the double prime stimulus. These results suggest that automatic visual attention can focus on two separate locations simultaneously, to some extent sparing the region in between.
Abstract:
Motivated by a recently proposed biologically inspired face recognition approach, we investigated the relation between human behavior and a computational model based on Fourier-Bessel (FB) spatial patterns. We measured human recognition performance of FB filtered face images using an 8-alternative forced-choice method. Test stimuli were generated by converting the images from the spatial to the FB domain, filtering the resulting coefficients with a band-pass filter, and finally taking the inverse FB transformation of the filtered coefficients. The performance of the computational models was tested using a simulation of the psychophysical experiment. In the FB model, face images were first filtered by simulated V1-type neurons and later analyzed globally for their content of FB components. In general, there was a higher human contrast sensitivity to radially than to angularly filtered images, but both functions peaked at the 11.3-16 frequency interval. The FB-based model presented similar behavior with regard to peak position and relative sensitivity, but had a wider frequency bandwidth and a narrower response range. The response pattern of two alternative models, based on local FB analysis and on raw luminance, strongly diverged from the human behavior patterns. These results suggest that human performance can be constrained by the type of information conveyed by polar patterns, and consequently that humans might use FB-like spatial patterns in face processing.
Abstract:
The occurrence of a weak auditory warning stimulus increases the speed of the response to a subsequent visual target stimulus that must be identified. This facilitatory effect has been attributed to the temporal expectancy automatically induced by the warning stimulus. It has not been determined whether this results from a modulation of the stimulus identification process, the response selection process or both. The present study examined these possibilities. A group of 12 young adults performed a reaction time location identification task and another group of 12 young adults performed a reaction time shape identification task. A visual target stimulus was presented 1850 to 2350 ms plus a fixed interval (50, 100, 200, 400, 800, or 1600 ms, depending on the block) after the appearance of a fixation point, on its left or right side, above or below a virtual horizontal line passing through it. In half of the trials, a weak auditory warning stimulus (S1) appeared 50, 100, 200, 400, 800, or 1600 ms (according to the block) before the target stimulus (S2). Twelve trials were run for each condition. The S1 produced a facilitatory effect for the 200, 400, 800, and 1600 ms stimulus onset asynchronies (SOA) in the case of the side stimulus-response (S-R) corresponding condition, and for the 100 and 400 ms SOA in the case of the side S-R non-corresponding condition. Since these two conditions differ mainly by their response selection requirements, it is reasonable to conclude that automatic temporal expectancy influences the response selection process.
Abstract:
The visualization of tools and manipulable objects activates motor-related areas in the cortex, facilitating possible actions toward them. This pattern of activity may underlie the phenomenon of object affordance. Some cortical motor neurons are also covertly activated during the recognition of body parts such as hands. One hypothesis is that different subpopulations of motor neurons in the frontal cortex are activated in each motor program; for example, canonical neurons in the premotor cortex are responsible for the affordance of visual objects, while mirror neurons support motor imagery triggered during handedness recognition. However, the question remains whether these subpopulations work independently. This hypothesis can be tested with a manual reaction time (MRT) task with a priming paradigm to evaluate whether the view of a manipulable object interferes with the motor imagery of the subject's hand. The MRT provides a measure of the course of information processing in the brain and allows indirect evaluation of cognitive processes. Our results suggest that canonical and mirror neurons work together to create a motor plan involving hand movements to facilitate successful object manipulation.
Abstract:
The impact of automatic and manual shelling methods during manual/visual sorting of different batches of Brazil nuts from the 2010 and 2011 harvests was evaluated in order to investigate aflatoxin prevention. The samples were tested as follows: in-shell, shell, shelled, and pieces, in order to evaluate the moisture content (mc), water activity (Aw), and total aflatoxin (LOD = 0.3 µg/kg and LOQ = 0.85 µg/kg) at the Brazil nut processing plant. The aflatoxin levels obtained ranged from 3.0 to 60.3 µg/g for the manually shelled nut samples and from 2.0 to 31.0 µg/g for the automatically shelled samples. All samples showed levels of mc below the limit of 15%; on the other hand, shelled samples from both harvests showed levels of Aw above the limit. There were no significant differences between the manual and automatic shelling results during the sorting stages. However, visual sorting was effective in decreasing the aflatoxin contamination in both methods.