In recent years, acoustical analysis of the swallowing mechanism has received considerable attention because of its diagnostic potential. This paper presents a hidden Markov model (HMM)-based method for swallowing sound segmentation and classification. Swallowing sound signals of 15 healthy and 11 dysphagic subjects were studied. The signals were divided into sequences of 25 ms segments, each of which was represented by seven features. The sequences of features were modeled by HMMs. Trained HMMs were used to segment the swallowing sounds into three distinct phases: the initial quiet period, initial discrete sounds (IDS), and bolus transit sounds (BTS). Among the seven features, the HMM based on the multi-scale product of wavelet coefficients gave the highest segmentation accuracy, and the linear prediction coefficient (LPC)-based HMM showed the weakest performance. In addition, HMMs were used to classify the swallowing sounds of healthy subjects and dysphagic patients, and the classification accuracy of different HMM configurations was investigated. When the number of HMM states N was increased from 4 to 8, the classification error gradually decreased. In most cases, the classification error for N = 9 was higher than that for N = 8. Among the seven features used, root mean square (RMS) and waveform fractal dimension (WFD) showed the best performance in the HMM-based classification of swallowing sounds. When the feature sequences of the IDS segment were modeled separately, the accuracy reached up to 85.5%. As a second-stage classification, a screening algorithm was used; with RMS as the characteristic feature of the swallowing sounds and N = 8 states, it correctly classified all subjects except one healthy subject.
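The framing step and the two best-performing features named above (RMS and WFD) can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the frames are taken as non-overlapping, the sampling rate `fs` is a free parameter, and WFD is computed with Katz's definition using unit spacing on the time axis. The function names are hypothetical.

```python
import math

def frame_signal(x, fs, frame_ms=25.0):
    """Split x into consecutive non-overlapping frames of frame_ms milliseconds.

    Non-overlapping framing is an assumption; trailing samples that do not
    fill a whole frame are dropped.
    """
    n = int(round(fs * frame_ms / 1000.0))
    return [x[i:i + n] for i in range(0, len(x) - n + 1, n)]

def rms(frame):
    """Root mean square of one frame."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))

def katz_fd(frame):
    """Waveform fractal dimension by Katz's formula (assumed definition).

    The waveform is treated as a planar curve with unit spacing on the time
    axis. Requires at least 3 samples.
    """
    steps = len(frame) - 1
    # Total length of the curve: sum of distances between successive points.
    length = sum(math.hypot(1.0, frame[i + 1] - frame[i]) for i in range(steps))
    # Extent: largest distance from the first point to any later point.
    extent = max(math.hypot(i, frame[i] - frame[0]) for i in range(1, len(frame)))
    return math.log10(steps) / (math.log10(steps) + math.log10(extent / length))
```

Under this definition a straight line has a fractal dimension of 1, and more irregular waveforms give larger values. Each frame then contributes one value per feature, and the resulting sequences are what an HMM would be trained on.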
Automated detection of swallowing sounds in recordings of swallowing and breath sounds is important for monitoring applications, in which recordings are long. This paper presents a novel method for swallowing sound detection using hidden Markov modeling of recurrence plot features. Tracheal sound recordings of 15 healthy and nine dysphagic subjects were studied. The multidimensional state space trajectory of each signal was reconstructed using Takens' method of delays. The sequences of three recurrence plot features of the reconstructed trajectories, which have shown discriminating capability between swallowing and breath sounds, were modeled by three hidden Markov models. The Viterbi algorithm was used for swallowing sound detection. The results were validated manually by inspection of the simultaneously recorded airflow signal and the spectrogram of the sounds, and also by auditory means. The experimental results suggested that the proposed method outperformed previous swallowing sound detection methods.
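As an illustration of how Viterbi decoding can label a feature sequence, the following minimal sketch decodes a two-state HMM with discrete observations. Everything specific here is an assumption for illustration: the state names, the quantized "low"/"high" observations, and all probabilities are toy values, not parameters from the study, which used three HMMs on recurrence plot features.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely state path for obs (log-domain Viterbi)."""
    # delta[s]: best log-probability of any path that ends in state s.
    delta = {s: log_start[s] + log_emit[s][obs[0]] for s in states}
    backptr = []
    for o in obs[1:]:
        prev, delta, ptr = delta, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            delta[s] = prev[best] + log_trans[best][s] + log_emit[s][o]
            ptr[s] = best
        backptr.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: delta[s])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def logs(table):
    """Convert a dict of probabilities (possibly nested) to log-probabilities."""
    return {k: logs(v) if isinstance(v, dict) else math.log(v) for k, v in table.items()}

# Toy parameters (hypothetical, for illustration only).
STATES = ["breath", "swallow"]
LOG_START = logs({"breath": 0.8, "swallow": 0.2})
LOG_TRANS = logs({"breath": {"breath": 0.9, "swallow": 0.1},
                  "swallow": {"breath": 0.2, "swallow": 0.8}})
LOG_EMIT = logs({"breath": {"low": 0.9, "high": 0.1},
                 "swallow": {"low": 0.2, "high": 0.8}})

observations = ["low", "low", "high", "high", "high", "low", "low"]
decoded = viterbi(observations, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
```

With these toy values, the three consecutive "high" observations are decoded as a swallow segment surrounded by breath segments. Working in the log domain avoids numerical underflow on long recordings.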
The goal of this study was to develop an automated and objective method to separate swallowing sounds from breath sounds. Swallowing sound detection can be used as part of a system for swallowing mechanism assessment and diagnosis of swallowing dysfunction (dysphagia) by acoustical means. In this study, an algorithm based on multilayer feedforward neural networks is proposed for decomposing tracheal sound into swallowing and respiratory segments. Among the many features examined, the root mean square of the signal, the average power of the signal over 150-450 Hz, and the waveform fractal dimension were selected as inputs to the neural network. Findings from previous studies on the temporal and durational patterns of swallowing and respiration were then used by a smart algorithm to further identify the swallow and breath segments. The proposed method was applied to 18 tracheal sound recordings of 7 healthy subjects (ages 13-30 years, 4 males). The results were validated manually by visual inspection of the airflow measurement and the spectrogram of the sounds, and by auditory means. The algorithm detected 91.7% of swallows correctly. The average rates of missed swallows and false detections were 8.3% and 9.5%, respectively. With additional preprocessing and postprocessing, the proposed method may be used for automated extraction of swallowing sounds from breath sounds in healthy and dysphagic individuals.
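One of the selected features, the average power over 150-450 Hz, can be sketched as follows. The abstract does not give the exact estimator, so this sketch assumes a plain DFT of one frame, with power taken as the squared magnitude normalized by the squared frame length and averaged over the bins inside the band. The function name and the sampling rate are hypothetical.

```python
import cmath
import math

def band_power(frame, fs, f_lo=150.0, f_hi=450.0):
    """Average normalized DFT power of one frame over the band [f_lo, f_hi] Hz.

    Assumes the frame is long enough that at least one DFT bin falls in the band.
    """
    n = len(frame)
    k_lo = math.ceil(f_lo * n / fs)    # first bin at or above f_lo
    k_hi = math.floor(f_hi * n / fs)   # last bin at or below f_hi
    powers = []
    for k in range(k_lo, k_hi + 1):
        # Direct DFT of bin k; adequate for short frames in a sketch,
        # an FFT would normally be used in practice.
        coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                    for i, v in enumerate(frame))
        powers.append(abs(coeff) ** 2 / n ** 2)
    return sum(powers) / len(powers)
```

A tone inside the band yields a clearly nonzero value, while a tone outside the band yields a value near zero, which is what makes the feature useful for separating sounds with different spectral content.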
This paper presents an automated and objective method for extracting swallowing sounds from a recording of tracheal breath and swallowing sounds. The proposed method takes advantage of the fact that swallowing sounds are more non-stationary than breath sounds and have large components in many wavelet scales, whereas the wavelet transform coefficients of breath sounds in higher wavelet scales are small. Therefore, a wavelet transform-based filter was used, in which a multiresolution decomposition-reconstruction process filters the signal. Swallowing sounds are then detected in the filtered signal. The proposed method was applied to the tracheal sound recordings of 15 healthy and 11 dysphagic subjects. The results were validated manually by visual inspection of the airflow measurement and the spectrogram of the sounds, and by auditory means. Experimental results show that the proposed method is more accurate, efficient, and objective than previously proposed methods. Swallowing sound detection may be employed in a system for automated swallowing assessment and diagnosis of swallowing disorders (dysphagia) by acoustical means.
Several metric tools for quantitative analysis of scalar time series have been developed using the theory of nonlinear dynamics. The goal of this work was to study the characteristics of swallowing sound using these metric tools. Takens' method of delays was used to reconstruct a multidimensional state space representation of the swallowing sounds of 6 healthy subjects (ages 13-30 years, 3 males) being fed thin and thick liquid textures. The optimum time delay for different subjects varied from 3 to 9 samples. The false nearest neighbors method was used to obtain the proper embedding dimension. The correlation dimension was calculated using the Grassberger-Procaccia algorithm. The results suggest that swallowing sound is well characterized by a small number of dimensions. The largest Lyapunov exponent was also estimated to evaluate the presence of chaos. As the largest Lyapunov exponent was negative in some cases, it may be concluded that swallowing sound is not necessarily a chaotic process.
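The reconstruction and correlation-dimension steps can be sketched in a few lines. This is a simplified illustration under stated assumptions: the embedding dimension and delay are supplied by the caller rather than chosen by the false nearest neighbors method, and the correlation dimension is approximated by the slope of log C(r) against log r between only two radii, which is much cruder than a proper fit over a scaling region. Function names are hypothetical.

```python
import math

def delay_embed(x, dim, tau):
    """Method-of-delays reconstruction: vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n_vectors = len(x) - (dim - 1) * tau
    return [tuple(x[i + k * tau] for k in range(dim)) for i in range(n_vectors)]

def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r (Grassberger-Procaccia C(r))."""
    n = len(points)
    close = sum(1 for i in range(n) for j in range(i + 1, n)
                if math.dist(points[i], points[j]) < r)
    return 2.0 * close / (n * (n - 1))

def slope_estimate(points, r1, r2):
    """Two-radius estimate of the correlation dimension (crude, for illustration).

    Both radii must be large enough that C(r) is nonzero.
    """
    c1 = correlation_sum(points, r1)
    c2 = correlation_sum(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))
```

As a sanity check, embedding a linear ramp places all points on a straight line, so the estimate comes out close to 1. The pairwise loop is quadratic in the number of points, so a real implementation would subsample or use a spatial index.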
This paper presents an objective method for analyzing the temporal pattern of the swallowing mechanism based on analysis of swallowing sounds and the submental surface electromyogram (EMG). In this study, the swallowing sound signal and submental EMG of 12 healthy subjects were recorded. Swallowing sound signals were divided into 25 ms segments, each of which was represented by its waveform fractal dimension (WFD). The temporal pattern of the swallowing sound signal was identified by hidden Markov modeling (HMM) of the WFD sequences. Submental muscle contraction was marked by thresholding the RMS values of the EMG signals. The duration of the swallowing sound phases, the duration of the submental muscle contraction, and the time difference between the onset of submental muscle contraction and the opening of the cricopharynx were calculated. Experimental results suggest that the proposed method is efficient for studying the temporal pattern of the swallowing mechanism and can provide an objective and accurate approach to swallowing mechanism analysis.
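Marking muscle contraction by thresholding RMS values can be sketched as below. The abstract does not say how the threshold was chosen or how long each RMS window was, so both are left as parameters, and the function name is hypothetical.

```python
def active_intervals(rms_values, threshold, frame_s):
    """Return (onset, offset) times in seconds of runs where RMS exceeds threshold.

    rms_values holds one RMS value per consecutive window of frame_s seconds.
    The onset is the start of the first window above threshold and the offset
    is the end of the last window above threshold in each run.
    """
    intervals = []
    start = None
    for i, value in enumerate(rms_values):
        if value > threshold and start is None:
            start = i                      # a run begins
        elif value <= threshold and start is not None:
            intervals.append((start * frame_s, i * frame_s))
            start = None                   # the run has ended
    if start is not None:                  # run extends to the end of the data
        intervals.append((start * frame_s, len(rms_values) * frame_s))
    return intervals
```

The duration of a contraction is then the offset minus the onset, and the onset can be compared with the start of a swallowing sound phase to obtain a time difference between the two signals.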
This paper presents a quantitative analysis of swallowing sounds in normal and dysphagic subjects based on nonlinear dynamics metric tools. In addition, an automated method is proposed to identify patients at risk of dysphagia. A multidimensional phase space representation of the swallowing sound was reconstructed using Takens' method of delays. The Rosenstein and false nearest neighbor (FNN) methods were employed to evaluate the optimum time delay and the proper embedding dimension, respectively. The Grassberger-Procaccia algorithm was used to calculate the correlation dimension as a measure of the complexity of the reconstructed attractor. The analysis demonstrated the low-dimensional dynamic characteristics of normal and dysphagic swallowing sounds. The optimum time delay and correlation dimension of the opening and transmission phases of swallowing sounds were used as features for a 3-nearest neighbor classifier to identify individuals at risk of dysphagia. The method was applied to tracheal sound recordings of 15 healthy subjects and 11 patients with some degree of dysphagia. The algorithm classified 83% of swallows correctly. Finally, a screening algorithm was used, which correctly classified 24 out of 26 subjects. This study suggests that nonlinear analysis is a promising tool for quantitative analysis of swallowing sounds and swallowing disorders.
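A 3-nearest neighbor classifier of the kind mentioned above can be sketched as follows. The training points are made-up toy values used only to exercise the code, not data from the study, and the sketch uses just two features per swallow with plain Euclidean distance and no feature scaling, both of which are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.

    train is a list of (feature_tuple, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set (hypothetical): (time delay in samples, correlation dimension).
TOY_TRAIN = [
    ((3.0, 2.1), "healthy"),
    ((4.0, 2.3), "healthy"),
    ((3.5, 2.0), "healthy"),
    ((8.0, 3.4), "dysphagic"),
    ((7.5, 3.1), "dysphagic"),
    ((9.0, 3.6), "dysphagic"),
]
```

With two classes and k = 3 the vote can never tie. Because the two features live on different numeric scales, a practical version would likely normalize them before computing distances.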
Digital image watermarking techniques have been proposed to prevent unauthorized distribution of multimedia data. A digital watermark encodes the owner's license information and embeds it into the image. Several discrete wavelet transform (DWT)-based techniques are used for watermarking. In this paper, a watermarking scheme is proposed in which the image is decomposed into wavelet coefficients and a visually recognizable logo is embedded in those coefficients. The wavelet coefficients corresponding to points located in neighborhoods of maximum entropy are proposed for embedding the watermark. This method embeds the maximum amount of watermark while keeping the watermark imperceptible. Watermarking techniques must be robust to attacks such as smoothing, sharpening, and compression. The maximum entropy areas can survive a variety of attacks and can be used as reference points for watermark embedding. The experimental results confirmed that the technique is robust to a variety of attacks.
In this study, the average power of tracheal sound (P_ave) was used to estimate flow with a parametric method as well as with adaptive filters as a nonparametric method. Based on preliminary studies, an exponential model was used to describe the relationship between flow and P_ave in the parametric method. It was assumed that the flow signal of at least one breath from each target flow is available for calibration. The flow estimation error of the parametric method was found to be 9±3% and 10±4% for inspiration and expiration, respectively. Among the nonparametric methods, the estimation error was lowest for the third-order adaptive filter using the average power of the tracheal sound (dB): 10±3% and 11±4% for inspiration and expiration, respectively.
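Calibrating an exponential flow model can be sketched as below. The abstract does not give the exact form of the model, so this sketch assumes flow = a * exp(b * P_ave) and fits a and b by ordinary least squares on the logarithm of the calibration flow. The function names and the calibration values in the usage note are hypothetical.

```python
import math

def fit_exponential(p_ave, flow):
    """Fit flow = a * exp(b * p_ave) by linear least squares on log(flow).

    Assumed model form; flow values must be positive (flow magnitudes).
    """
    n = len(p_ave)
    log_flow = [math.log(f) for f in flow]
    mean_p = sum(p_ave) / n
    mean_l = sum(log_flow) / n
    sxx = sum((p - mean_p) ** 2 for p in p_ave)
    sxy = sum((p - mean_p) * (l - mean_l) for p, l in zip(p_ave, log_flow))
    b = sxy / sxx                       # slope of log(flow) against p_ave
    a = math.exp(mean_l - b * mean_p)   # intercept mapped back from the log domain
    return a, b

def estimate_flow(p, a, b):
    """Estimate flow from average power with the fitted model."""
    return a * math.exp(b * p)
```

For example, calling `fit_exponential` on calibration breaths and then `estimate_flow` on new P_ave values gives flow estimates whose relative error can be compared against the measured flow. Fitting in the log domain weights relative rather than absolute errors, which is one possible design choice among several.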