changeset 12514:be0d1e9ab958 octave-forge
maint: moved audio to individual hg repository.
author | carandraug |
---|---|
date | Mon, 11 Aug 2014 02:06:53 +0000 |
parents | 43bdb4a84fe6 |
children | ac9b1d7fb7b2 |
files | main/audio/COPYING main/audio/DESCRIPTION main/audio/INDEX main/audio/NEWS main/audio/doc/aurecord.1 main/audio/doc/endpoint.doc main/audio/inst/au.m main/audio/inst/auload.m main/audio/inst/auplot.m main/audio/inst/ausave.m main/audio/inst/sample.wav main/audio/inst/sound.m main/audio/inst/soundsc.m main/audio/src/.svnignore main/audio/src/Makeconf.in main/audio/src/Makefile main/audio/src/Makefile.linux main/audio/src/Makefile.macosx main/audio/src/OFSndPlay.cc main/audio/src/aurecord.cc main/audio/src/autogen.sh main/audio/src/configure.base main/audio/src/endpoint.cc main/audio/src/endpoint.h |
diffstat | 24 files changed, 0 insertions(+), 2972 deletions(-) |
--- a/main/audio/COPYING Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,1 +0,0 @@
-See individual files for licenses
--- a/main/audio/DESCRIPTION Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,11 +0,0 @@
-Name: Audio
-Version: 1.1.4
-Date: 2011-12-11
-Author: Paul Kienzle <pkienzle@users.sf.net>
-Maintainer: Octave-Forge community <octave-dev@lists.sourceforge.net>
-Title: Audio
-Description: Audio recording, processing and playing tools.
-Depends: octave (>= 2.9.7), miscellaneous (>= 1.1.0)
-Autoload: no
-License: GPLv3+, public domain
-Url: http://octave.sf.net
--- a/main/audio/INDEX Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,11 +0,0 @@
-audio >> Audio
-Record and play
- sound
- soundsc
- aurecord aucapture
-Read and write
- auload ausave
-Process
- auplot
- au
-
--- a/main/audio/NEWS Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,10 +0,0 @@
-Summary of important user-visible changes for releases of the audio package
-
-===============================================================================
-audio-X.Y.Z Release Date: 2011-YY-XX Release Manager:
-===============================================================================
-
- ** The function `clip' was been moved to the miscellaneous package. For
-    this, the audio package is now dependent on miscellaneous (>= 1.1.0).
-
- ** Package is no longer automatically loaded.
--- a/main/audio/doc/aurecord.1 Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,53 +0,0 @@
-.\" Man page added by Dirk Eddelbuettel <edd@debian.org>
-.TH AURECORD 1 "Debian/GNU Linux"
-.SH NAME
-aurecord \- record audio signals at specified rate
-.SH SYNOPSIS
-.B aurecord -r rate -c channels -t time -e
-.SH DESCRIPTION
-.BR aurecord (1)
-is an internal command to the
-.B signalPAK
-routines and is not intended for direct command-line use.
-
-.BR aurecord (1)
-records
-.I time
-seconds of
-.I channels
-channel audio at
-.IR rate .
-If the
-.I -e
-option is specified, recording doesn't begin until there is a signal
-to record, and stops when the signal ends, so it is great for
-capturing a bit of speech without having to remove the surrounding
-silence.
-
-The data is written to stdout as:
-.br
-    4 byte rate (machine format)
-.br
-    4 byte number of channels (machine format)
-.br
-    2 byte channel1 channel2 channel3 ... (machine format)
-.br
-    2 byte channel1 channel2 channel3 ... (machine format)
-.br
-    ...
-.br
-    2 byte channel1 channel2 channel3 ... (machine format)
-.SH AUTHOR
-.B aurecord
-is part of
-.BR signalPAK ,
-a collection of signal-processing routines for Octave, written by
-Paul Kienzle. See
-.BR http://users.powernet.co.uk/kienzle/signalPAK/ .
-
-
-
-
-
-
-
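The stream layout described in the removed aurecord.1 man page (a 4-byte rate, a 4-byte channel count, then interleaved 2-byte samples, all in machine format) can be read back with a short Python sketch. This is not code from the package. It assumes the two header fields are signed 32-bit integers, the samples are signed 16-bit integers, and "machine format" means the native byte order of the host; `parse_aurecord` is a hypothetical helper name.

```python
import struct

def parse_aurecord(buf):
    """Split an aurecord-style byte stream into (rate, channels, frames).

    Assumptions of this sketch: header fields are signed 32-bit,
    samples are signed 16-bit, everything is in native byte order.
    """
    if len(buf) < 8:
        raise ValueError("stream is shorter than the 8-byte header")
    # "=" selects native byte order with standard sizes (i = 4 bytes, h = 2).
    rate, channels = struct.unpack_from("=ii", buf, 0)
    if channels <= 0:
        raise ValueError("channel count must be positive")
    body = buf[8:]
    n_frames = len(body) // (2 * channels)  # a trailing partial frame is ignored
    samples = struct.unpack_from("=%dh" % (n_frames * channels), body, 0)
    # Regroup the interleaved samples into one tuple per time step.
    frames = [samples[i * channels:(i + 1) * channels] for i in range(n_frames)]
    return rate, channels, frames
```

Each returned frame holds one sample per channel, which matches the "one row per time slice" layout that auload uses for its output.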
--- a/main/audio/doc/endpoint.doc Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,124 +0,0 @@
-Interactive speech recognition systems are only useful if they can
-can run with live input. The problem with live input, as opposed to
-pre-recorded data, is that the exact start and end of the utterance
-is unknown. One technique to deal with this problem, is to record
-a fixed size utterance (e.g., 5 seconds) and assume that the user will
-speak the entire utterance within the time period. A recognizer which
-has silence models for the start and end of the utterance can thus
-parse such an utterance. However, such a scheme is obviously prone
-to errors and is computationally wasteful because the entire input
-buffer must be searched.
-
-The obvious solution is an endpointer which identifies the start and
-end of utterance. The problem is that endpointing an utterance, like
-speech recognition itself, is non-trivial.
-
-This is an endpointing algorithm designed for real-time input of
-a speech signal. "Real-time" means that the signal is processed
-in parallel with its recording. This allows a speech recognition
-system to run in parallel with the input of the utterance.
-
-This algorithm calculates and uses "cheap" parameters, RMS energy and
-zero crossing counts. Thus, this algorithm can run in real-time on
-any micro processor without the need for a DSP.
-
-Because the signal is end-pointed in real-time, errors can and do
-occur in identifying the start and end of the actual utterance.
-Thus, the labels, or tags, that this endpointer gives for each
-frame of data are some what "fuzzy". That is, the endpointer will
-tentitively label a frame but may indicate at a later frame that
-the identification of a previous frame was in error. This requires
-special handling by the speech recognition system in that it must
-be capable of re-starting recognition after false starts and continuing
-searching after possible end of utterance frames.
-
-The endpointer works by passing to it one frame of data at a time. The
-endpointer will check the frame to determine if it is part of the
-utterance and return a label, or tag, for the frame. The possible
-labels are the following:
-
-    EP_NONE
-    EP_NOSTARTSILENCE
-    EP_SILENCE
-    EP_SIGNAL
-    EP_RESET
-    EP_MAYBEEND
-    EP_NOTEND
-    EP_ENDOFUTT
-
-EP_NONE - This is a NULL label which the endpointer does not return,
-This is convenient to have for labeling frames for which the endpointer
-is turned off.
-
-EP_NOSTARTSILENCE - The first frame is so loud or noisy that it does not
-"look" like background silence. This depends on absolute thresholds and
-can generate a false positive for really noisy signals or a false negative
-for really quiet signals. See theory of operation below.
-
-EP_SILENCE - This label is returned for silence frames before the start
-of the utterance.
-
-EP_SIGNAL - This is returned for each frame that appears to be contained
-in the utterance signal. The first E_SIGNAL frame marks the start of the
-utterance.
-
-EP_RESET - This indicates a false start condition. The previous EP_SIGNAL
-frames were, in fact, not part of the utterance. The recognition system
-should reset itself and start over.
-
-EP_MAYBEEND - This label indicates the possible end of utterance. The
-frame which has this label is actually one frame after the possible last
-frame of the utterance. As this is a tentative label, the recognition
-system should either do end of utterance processing or save its state at
-this point for end of utterance processing. In either case, the recognition
-system must continuing searching, including this frame, until the end of
-utterance has been confirmed.
-
-EP_NOTEND - The previous EP_MAYBEEND label was wrong. The utterance is
-continuing. The recognition can now forget its possible end of utterance
-state.
-
-EP_ENDOFUTT - The label confirms the actual end of utterance. The real
-end of utterance was the last EP_SIGNAL frame before the last EP_MAYBEEND
-labeled frame.
-
-
-Theory of operation:
-For each frame of data, the endpointer calculates the RMS energy and the
-zero-cross count. The first few frames are assumed to be background
-silence and are used to initialize various thresholds. If there is no
-starting silence (the user speaks too soon), then the endpointer will
-mislabel the first syllable (which may be one or more words) until a
-silence is reached. Similarly, if there is no ending silence, then the
-endpointer will not mark the end of utterance.
-
-A running average of the background silence is kept which consists of
-averaging the last few silence frames. This background silence is used
-to set energy thresholds and the Schmidt trigger for the zero-cross
-counter.
-
-The endpointer contains over a dozen thresholds and settings which are used
-to determine frication, voicing, and silence. These thresholds have been
-determined emperically.
-
-The sampling rate, window size in samples, and the step size in samples
-are passed to the class constructor. These three arguments are used to
-calculate the internal thresholds (actual zero-cross count values for
-frequencies and number of frames for durations). Any or all of the internal
-
-CAVEATS:
-The endpointer will fail if there is no starting silence or endsilence.
-If there is no starting silence, then the first syllable up to the first
-stop consonant will be lost. If there is no ending silence, then the last
-syllable will the lost or no end of utterance will be determined.
-thresholds can be changed by specifying them in the class constructor.
-
-The endpointer makes no distinction between noise and speech. Impulse
-noises will fool it. The endpointer tends to be conservative in that it
-will err by including noises with the signal rather than cutting out part
-of the actual speech signal. So, a good recognition system must model
-noise.
-
-Large amplitude background white noise may cause the endpointer to miss
-fricatives, weak or strong. If the background noise is known a priori, then
-the endpointer thresholds can be adjusted to cope with the noise.
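The removed endpoint.doc names two per-frame features, RMS energy and a zero-cross count gated by a Schmitt trigger. A minimal Python sketch of those two measurements follows. It is an illustration only, not a port of endpoint.cc (whose contents are not shown in this diff); the function names and the `hysteresis` parameter are assumptions, and the many thresholds and label transitions of the real endpointer are left out.

```python
import math

def rms_energy(frame):
    """Root-mean-square energy of one frame of samples."""
    if not frame:
        raise ValueError("frame must not be empty")
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def zero_crossings(frame, hysteresis=0.0):
    """Count sign changes, ignoring samples within +/- hysteresis of zero.

    The dead band plays the role of a Schmitt trigger: low-level noise
    around zero does not register as a crossing.
    """
    count = 0
    state = 0  # +1 after a sample above the band, -1 below it, 0 undecided
    for s in frame:
        if s > hysteresis:
            new_state = 1
        elif s < -hysteresis:
            new_state = -1
        else:
            continue  # inside the dead band: keep the previous state
        if state != 0 and new_state != state:
            count += 1
        state = new_state
    return count
```

According to the removed text, the endpointer derives this kind of noise gate from a running average of recent silence frames, so in practice `hysteresis` would be updated as the background level changes.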
--- a/main/audio/inst/au.m Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,51 +0,0 @@
-## Copyright (C) 2000 Paul Kienzle <pkienzle@users.sf.net>
-##
-## This program is free software; you can redistribute it and/or modify it under
-## the terms of the GNU General Public License as published by the Free Software
-## Foundation; either version 3 of the License, or (at your option) any later
-## version.
-##
-## This program is distributed in the hope that it will be useful, but WITHOUT
-## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
-## details.
-##
-## You should have received a copy of the GNU General Public License along with
-## this program; if not, see <http://www.gnu.org/licenses/>.
-
-## y = au(x, fs, lo [, hi])
-##
-## Extract data from x for time range lo to hi in milliseconds. If lo
-## is [], start at the beginning. If hi is [], go to the end. If hi is
-## not specified, return the single element at lo. If lo<0, prepad the
-## signal to time lo. If hi is beyond the end, postpad the signal to
-## time hi.
-
-## TODO: modify prepad and postpad so that they accept matrices.
-function y=au(x,fs,lo,hi)
-  if nargin<3 || nargin>4,
-    usage("y = au(x, fs, lo [,hi])");
-  endif
-
-  if nargin<4, hi=lo; endif
-  if isempty(lo),
-    lo=1;
-  else
-    lo=fix(lo*fs/1000)+1;
-  endif
-  if isempty(hi),
-    hi=length(x);
-  else
-    hi=fix(hi*fs/1000)+1;
-  endif
-  if hi<lo, t=hi; hi=lo; lo=hi; endif
-  if (size(x,1)==1 || size(x,2)==1)
-    y=x(max(lo,1):min(hi,length(x)));
-    if (lo<1), y=prepad(y,length(y)-lo+1); endif
-    if (hi>length(x)), y=postpad(y,length(y)+hi-length(x)); endif
-  else
-    y=x(max(lo,1):min(hi,length(x)), :);
-    if (lo<1), y=[zeros(size(x,2),-lo+1) ; y]; endif
-    if (hi>length(x)), y=[y ; zeros(size(x,2),hi-length(x))]; endif
-  endif
-endfunction
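The removed au.m converts a time in milliseconds to a 1-based sample index with `fix(ms*fs/1000)+1`. The Python sketch below reproduces only that arithmetic; the function names are hypothetical. Note that the swap in the removed listing (`t=hi; hi=lo; lo=hi;`) leaves `lo` equal to `hi` rather than exchanging them. The sketch exchanges both values, which is presumably what was intended.

```python
import math

def ms_to_sample(ms, fs):
    """1-based sample index for a time in ms, truncating toward zero like fix()."""
    return math.trunc(ms * fs / 1000) + 1

def time_range(lo_ms, hi_ms, fs):
    """Return (lo, hi) sample indices, ordered so that lo <= hi."""
    lo = ms_to_sample(lo_ms, fs)
    hi = ms_to_sample(hi_ms, fs)
    if hi < lo:
        lo, hi = hi, lo  # a full swap, unlike the listing above
    return lo, hi
```

At 8000 Hz the range 300 ms to 450 ms, the same one used in the auplot subrange demo, maps to samples 2401 through 3601.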
--- a/main/audio/inst/auload.m Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,383 +0,0 @@ -## Copyright (C) 1999 Paul Kienzle <pkienzle@users.sf.net> -## -## This program is free software; you can redistribute it and/or modify it under -## the terms of the GNU General Public License as published by the Free Software -## Foundation; either version 3 of the License, or (at your option) any later -## version. -## -## This program is distributed in the hope that it will be useful, but WITHOUT -## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more -## details. -## -## You should have received a copy of the GNU General Public License along with -## this program; if not, see <http://www.gnu.org/licenses/>. - -## -*- texinfo -*- -## @deftypefn {Function File} {[@var{x},@var{fs},@var{sampleformat}] =} auload (@var{filename}) -## -## Reads an audio waveform from a file given by the string @var{filename}. -## Returns the audio samples in data, one column per channel, one row per -## time slice. Also returns the sample rate and stored format (one of ulaw, -## alaw, char, int16, int24, int32, float, double). The sample value will be -## normalized to the range [-1,1] regardless of the stored format. -## -## @example -## [x, fs] = auload(file_in_loadpath("sample.wav")); -## auplot(x,fs); -## @end example -## -## Note that translating the asymmetric range [-2^n,2^n-1] into the -## symmetric range [-1,1] requires a DC offset of 2/2^n. The inverse -## process used by ausave requires a DC offset of -2/2^n, so loading and -## saving a file will not change the contents. Other applications may -## compensate for the asymmetry in a different way (including previous -## versions of auload/ausave) so you may find small differences in -## calculated DC offsets for the same file. 
-## @end deftypefn - -## 2001-09-04 Paul Kienzle <pkienzle@users.sf.net> -## * skip unknown blocks in WAVE format. -## 2001-09-05 Paul Kienzle <pkienzle@users.sf.net> -## * remove debugging stuff from AIFF format. -## * use data length if it is given rather than reading to the end of file. -## 2001-12-11 Paul Kienzle <pkienzle@users.sf.net> -## * use closed interval [-1,1] rather than open interval [-1,1) internally - -function [data, rate, sampleformat] = auload(path) - - if (nargin != 1) - usage("[x, fs, sampleformat] = auload('filename.ext')"); - end - data = []; # if error then read nothing - rate = 8000; - sampleformat = 'ulaw'; - ext = rindex(path, '.'); - if (ext == 0) - usage('x = auload(filename.ext)'); - end - ext = tolower(substr(path, ext+1, length(path)-ext)); - - [file, msg] = fopen(path, 'rb'); - if (file == -1) - error([ msg, ": ", path]); - end - - msg = sprintf('Invalid audio header: %s', path); - ## Microsoft .wav format - if strcmp(ext,'wav') - - ## Header format obtained from sox/wav.c - ## April 15, 1992 - ## Copyright 1992 Rick Richardson - ## Copyright 1991 Lance Norskog And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. This copyright notice must be maintained. - ## Lance Norskog And Sundry Contributors are not responsible for - ## the consequences of using this software. 
- - ## check the file magic header bytes - arch = 'ieee-le'; - str = char(fread(file, 4, 'char')'); - if !strcmp(str, 'RIFF') - error(msg); - end - len = fread(file, 1, 'int32', 0, arch); - str = char(fread(file, 4, 'char')'); - if !strcmp(str, 'WAVE') - error(msg); - end - - ## skip to the "fmt " section, ignoring everything else - while (1) - if feof(file) - error(msg); - end - str = char(fread(file, 4, 'char')'); - len = fread(file, 1, 'int32', 0, arch); - if strcmp(str, 'fmt ') - break; - end - fseek(file, len, SEEK_CUR); - end - - ## read the "fmt " section - formatid = fread(file, 1, 'int16', 0, arch); - channels = fread(file, 1, 'int16', 0, arch); - rate = fread(file, 1, 'int32', 0, arch); - fread(file, 1, 'int32', 0, arch); - fread(file, 1, 'int16', 0, arch); - bits = fread(file, 1, 'int16', 0, arch); - fseek(file, len-16, SEEK_CUR); - - ## skip to the "data" section, ignoring everything else - while (1) - if feof(file) - error(msg); - end - str = char(fread(file, 4, 'char')'); - len = fread(file, 1, 'int32', 0, arch); - if strcmp(str, 'data') - break; - end - fseek(file, len, SEEK_CUR); - end - - if (formatid == 1) - if bits == 8 - sampleformat = 'uchar'; - precision = 'uchar'; - samples = len; - elseif bits == 16 - sampleformat = 'int16'; - precision = 'int16'; - samples = len/2; - elseif bits == 24 - sampleformat = 'int24'; - precision = 'int24'; - samples = len/3; - elseif bits == 32 - sampleformat = 'int32'; - precision = 'int32'; - samples = len/4; - else - error(msg); - endif - elseif (formatid == 3) - if bits == 32 - sampleformat = 'float'; - precision = 'float'; - samples = len/4; - elseif bits == 64 - sampleformat = 'double'; - precision = 'double'; - samples = len/8; - else - error(msg); - endif - elseif (formatid == 6 && bits == 8) - sampleformat = 'alaw'; - precision = 'uchar'; - samples = len; - elseif (formatid == 7 && bits == 8) - sampleformat = 'ulaw'; - precision = 'uchar'; - samples = len; - else - error(msg); - return; - endif - - ## Sun 
.au format - elseif strcmp(ext, 'au') - - ## Header format obtained from sox/au.c - ## September 25, 1991 - ## Copyright 1991 Guido van Rossum And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. This copyright notice must be maintained. - ## Guido van Rossum And Sundry Contributors are not responsible for - ## the consequences of using this software. - - str = char(fread(file, 4, 'char')'); - magic=' ds.'; - invmagic='ds. '; - magic(1) = char(0); - invmagic(1) = char(0); - if strcmp(str, 'dns.') || strcmp(str, magic) - arch = 'ieee-le'; - elseif strcmp(str, '.snd') || strcmp(str, invmagic) - arch = 'ieee-be'; - else - error(msg); - end - header = fread(file, 1, 'int32', 0, 'ieee-be'); - len = fread(file, 1, 'int32', 0, 'ieee-be'); - formatid = fread(file, 1, 'int32', 0, 'ieee-be'); - rate = fread(file, 1, 'int32', 0, 'ieee-be'); - channels = fread(file, 1, 'int32', 0, 'ieee-be'); - fseek(file, header-24, SEEK_CUR); % skip file comment - - ## interpret the sample format - if formatid == 1 - sampleformat = 'ulaw'; - precision = 'uchar'; - bits = 12; - samples = len; - elseif formatid == 2 - sampleformat = 'uchar'; - precision = 'uchar'; - bits = 8; - samples = len; - elseif formatid == 3 - sampleformat = 'int16'; - precision = 'int16'; - bits = 16; - samples = len/2; - elseif formatid == 5 - sampleformat = 'int32'; - precision = 'int32'; - bits = 32; - samples = len/4; - elseif formatid == 6 - sampleformat = 'float'; - precision = 'float'; - bits = 32; - samples = len/4; - elseif formatid == 7 - sampleformat = 'double'; - precision = 'double'; - bits = 64; - samples = len/8; - else - error(msg); - end - - ## Apple/SGI .aiff format - elseif strcmp(ext,'aiff') || strcmp(ext,'aif') - - ## Header format obtained from sox/aiff.c - ## September 25, 1991 - ## Copyright 1991 Guido van Rossum And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. 
This copyright notice must be maintained. - ## Guido van Rossum And Sundry Contributors are not responsible for - ## the consequences of using this software. - ## - ## IEEE 80-bit float I/O taken from - ## ftp://ftp.mathworks.com/pub/contrib/signal/osprey.tar - ## David K. Mellinger - ## dave@mbari.org - ## +1-831-775-1805 - ## fax -1620 - ## Monterey Bay Aquarium Research Institute - ## 7700 Sandholdt Road - - ## check the file magic header bytes - arch = 'ieee-be'; - str = char(fread(file, 4, 'char')'); - if !strcmp(str, 'FORM') - error(msg); - end - len = fread(file, 1, 'int32', 0, arch); - str = char(fread(file, 4, 'char')'); - if !strcmp(str, 'AIFF') - error(msg); - end - - ## skip to the "COMM" section, ignoring everything else - while (1) - if feof(file) - error(msg); - end - str = char(fread(file, 4, 'char')'); - len = fread(file, 1, 'int32', 0, arch); - if strcmp(str, 'COMM') - break; - end - fseek(file, len, SEEK_CUR); - end - - ## read the "COMM" section - channels = fread(file, 1, 'int16', 0, arch); - frames = fread(file, 1, 'int32', 0, arch); - bits = fread(file, 1, 'int16', 0, arch); - exp = fread(file, 1, 'uint16', 0, arch); % read a 10-byte float - mant = fread(file, 2, 'uint32', 0, arch); - mant = mant(1) / 2^31 + mant(2) / 2^63; - if (exp >= 32768), mant = -mant; exp = exp - 32768; end - exp = exp - 16383; - rate = mant * 2^exp; - fseek(file, len-18, SEEK_CUR); - - ## skip to the "SSND" section, ignoring everything else - while (1) - if feof(file) - error(msg); - end - str = char(fread(file, 4, 'char')'); - len = fread(file, 1, 'int32', 0, arch); - if strcmp(str, 'SSND') - break; - end - fseek(file, len, SEEK_CUR); - end - offset = fread(file, 1, 'int32', 0, arch); - fread(file, 1, 'int32', 0, arch); - fseek(file, offset, SEEK_CUR); - - if bits == 8 - precision = 'uchar'; - sampleformat = 'uchar'; - samples = len - 8; - elseif bits == 16 - precision = 'int16'; - sampleformat = 'int16'; - samples = (len - 8)/2; - elseif bits == 32 - precision = 
'int32'; - sampleformat = 'int32'; - samples = (len - 8)/4; - else - error(msg); - endif - - ## file extension unknown - else - error('auload(filename.ext) understands .wav .au and .aiff only'); - end - - ## suck in all the samples - if (samples <= 0) samples = Inf; end - if (precision == 'int24') - data = fread(file, 3*samples, 'uint8', 0, arch); - if (arch == 'ieee-le') - data = data(1:3:end) + data(2:3:end) * 2^8 + cast(typecast(cast(data(3:3:end), 'uint8'), 'int8'), 'double') * 2^16; - else - data = data(3:3:end) + data(2:3:end) * 2^8 + cast(typecast(cast(data(1:3:end), 'uint8'), 'int8'), 'double') * 2^16; - endif - else - data = fread(file, samples, precision, 0, arch); - endif - fclose(file); - - ## convert samples into range [-1, 1) - if strcmp(sampleformat, 'alaw') - alaw = [ ... - -5504, -5248, -6016, -5760, -4480, -4224, -4992, -4736, ... - -7552, -7296, -8064, -7808, -6528, -6272, -7040, -6784, ... - -2752, -2624, -3008, -2880, -2240, -2112, -2496, -2368, ... - -3776, -3648, -4032, -3904, -3264, -3136, -3520, -3392, ... - -22016, -20992, -24064, -23040, -17920, -16896, -19968, -18944, ... - -30208, -29184, -32256, -31232, -26112, -25088, -28160, -27136, ... - -11008, -10496, -12032, -11520, -8960, -8448, -9984, -9472, ... - -15104, -14592, -16128, -15616, -13056, -12544, -14080, -13568, ... - -344, -328, -376, -360, -280, -264, -312, -296, ... - -472, -456, -504, -488, -408, -392, -440, -424, ... - -88, -72, -120, -104, -24, -8, -56, -40, ... - -216, -200, -248, -232, -152, -136, -184, -168, ... - -1376, -1312, -1504, -1440, -1120, -1056, -1248, -1184, ... - -1888, -1824, -2016, -1952, -1632, -1568, -1760, -1696, ... - -688, -656, -752, -720, -560, -528, -624, -592, ... 
- -944, -912, -1008, -976, -816, -784, -880, -848 ]; - alaw = ([ alaw,-alaw]+0.5)/32767.5; - data = alaw(data+1); - elseif strcmp(sampleformat, 'ulaw') - data = mu2lin(data, 0); - elseif strcmp(sampleformat, 'uchar') - ## [ 0, 255 ] -> [ -1, 1 ] - data = data/127.5 - 1; - elseif strcmp(sampleformat, 'int16') - ## [ -32768, 32767 ] -> [ -1, 1 ] - data = (data+0.5)/32767.5; - elseif strcmp(sampleformat, 'int32') - ## [ -2^31, 2^31-1 ] -> [ -1, 1 ] - data = (data+0.5)/(2^31-0.5); - end - data = reshape(data, channels, length(data)/channels)'; - -endfunction - -%!demo -%! [x, fs] = auload(file_in_loadpath("sample.wav")); -%! auplot(x,fs);
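The int16 branch of the removed auload.m maps the asymmetric integer range [-32768, 32767] onto the closed interval [-1, 1] with `(data+0.5)/32767.5`. The Python sketch below shows that mapping and one possible inverse. The forward formula is taken from the listing; the inverse, including its rounding step, is an assumption of this sketch, because the corresponding part of ausave.m is not visible in this portion of the diff.

```python
def int16_to_unit(v):
    """Map a 16-bit sample in [-32768, 32767] onto [-1, 1], as auload.m does."""
    return (v + 0.5) / 32767.5

def unit_to_int16(x):
    """Assumed inverse: undo the half-step offset and round to an integer."""
    return int(round(x * 32767.5 - 0.5))
```

The half-step offset is what sends both endpoints exactly to -1 and 1, at the cost of no integer mapping exactly to 0. That is the small DC offset the removed help text warns about.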
--- a/main/audio/inst/auplot.m Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,197 +0,0 @@ -## Copyright (C) 1999 Paul Kienzle <pkienzle@users.sf.net> -## -## This program is free software; you can redistribute it and/or modify it under -## the terms of the GNU General Public License as published by the Free Software -## Foundation; either version 3 of the License, or (at your option) any later -## version. -## -## This program is distributed in the hope that it will be useful, but WITHOUT -## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more -## details. -## -## You should have received a copy of the GNU General Public License along with -## this program; if not, see <http://www.gnu.org/licenses/>. - -## -*- texinfo -*- -## @deftypefn {Function File} {[@var{y},@var{t},@var{scale}] = } auplot (@var{x}) -## @deftypefnx {Function File} {[@var{y},@var{t},@var{scale}] = } auplot (@var{x},@var{fs}) -## @deftypefnx {Function File} {[@var{y},@var{t},@var{scale}] = } auplot (@var{x},@var{fs},@var{offset}) -## @deftypefnx {Function File} {[@var{y},@var{t},@var{scale}] = } auplot (@var{...},@var{plotstr}) -## -## Plot the waveform data, displaying time on the @var{x} axis. If you are -## plotting a slice from the middle of an array, you may want to specify -## the @var{offset} into the array to retain the appropriate time index. If -## the waveform contains multiple channels, then the data are scaled to -## the range [-1,1] and shifted so that they do not overlap. If a @var{plotstr} -## is given, it is passed as the third argument to the plot command. This -## allows you to set the linestyle easily. @var{fs} defaults to 8000 Hz, and -## @var{offset} defaults to 0 samples. -## -## Instead of plotting directly, you can ask for the returned processed -## vectors. 
If @var{y} has multiple channels, the plot should have the y-range -## [-1 2*size(y,2)-1]. scale specifies how much the matrix was scaled -## so that each signal would fit in the specified range. -## -## Since speech samples can be very long, we need a way to plot them -## rapidly. For long signals, auplot windows the data and keeps the -## minimum and maximum values in the window. Together, these values -## define the minimal polygon which contains the signal. The number of -## points in the polygon is set with the global variable auplot_points. -## The polygon may be either 'filled' or 'outline', as set by the global -## variable auplot_format. For moderately long data, the window does -## not contain enough points to draw an interesting polygon. In this -## case, simply choosing an arbitrary point from the window looks best. -## The global variable auplot_window sets the size of the window -## required for creating polygons. You can turn off the polygons -## entirely by setting auplot_format to 'sampled'. To turn off fast -## plotting entirely, set auplot_format to 'direct', or set -## auplot_points=1. There is no reason to do this since your screen -## resolution is limited and increasing the number of points plotted -## will not add any information. auplot_format, auplot_points and -## auplot_window may be set in .octaverc. By default auplot_format is -## 'outline', auplot_points=1000 and auplot_window=7. 
-## @end deftypefn - -## 2000-03 Paul Kienzle -## accept either row or column data -## implement fast plotting -## 2000-04 Paul Kienzle -## return signal and time vectors if asked - -## TODO: test offset and plotstr -## TODO: convert offset to time range in the form used by au -## TODO: rename to au; if nargout return data within time range -## TODO: otherwise plot the data -function [y_r, t_r, scale_r] = auplot(x, fs, offset, plotstr) - - global auplot_points=1000; - global auplot_format="outline"; - global auplot_window=7; - - if nargin<1 || nargin>4 - usage("[y, t, scale] = auplot(x [, fs [, offset [, plotstr]]])"); - endif - if nargin<2, fs = 8000; offset=0; plotstr = []; endif - if nargin<3, offset=0; plotstr = []; endif - if nargin<4, plotstr = []; endif - if ischar(fs), plotstr=fs; fs=8000; endif - if ischar(offset), plotstr=offset; offset=0; endif - if isempty(plotstr), plotstr=";;"; endif - - - if (size(x,1)<size(x,2)), x=x'; endif - - [samples, channels] = size(x); - r = ceil(samples/auplot_points); - c = floor(samples/r); - hastail = (samples>c*r); - - if r==1 || strcmp(auplot_format,"direct") - ## full plot - t=[0:samples-1]*1000/fs; - y=x; - elseif r<auplot_window || strcmp(auplot_format,"sampled") - ## sub-sampled plot - y=x(1:r:samples,:); - t=[0:size(y,1)-1]*1000*r/fs; - elseif strcmp(auplot_format,"filled") - ## filled plot - if hastail - t=zeros(2*(c+1),1); - y=zeros(2*(c+1),channels); - t(2*c+1)=t(2*c+2)=c*1000*r/fs; - else - t=zeros(2*c,1); - y=zeros(2*c,channels); - endif - t(1:2:2*c) = t(2:2:2*c) = [0:c-1]*1000*r/fs; - for chan=1:channels - head=reshape(x(1:r*c,chan),r,c); - y(1:2:2*c,chan) = max(head)'; - y(2:2:2*c,chan) = min(head)'; - if (hastail) - tail=x(r*c+1:samples,chan); - y(2*c+1,chan)=max(tail); - y(2*c+2,chan)=min(tail); - endif - endfor - elseif strcmp(auplot_format,"outline") - ## outline plot - if hastail - y=zeros(2*(c+1)+1,channels); - t=[0:c]; - else - y=zeros(2*c+1,channels); - t=[0:c-1]; - endif - t=[t, fliplr(t), 
0]*1000*r/fs; - for chan=1:channels - head=reshape(x(1:r*c,chan),r,c); - if hastail - tail=x(r*c+1:samples,chan); - y(:,chan)=[max(head), max(tail), min(tail), ... - fliplr(min(head)), max(head(:,1))]'; - else - y(:,chan)=[max(head), fliplr(min(head)), max(head(:,1))]'; - endif - endfor - else - error("auplot_format must be 'outline', 'filled', 'sampled' or 'direct'"); - endif - - t=t+offset*1000/fs; - grid; - if channels > 1 - scale = max(abs(y(:))); - if (scale > 0) y=y/scale; endif - for i=1:channels - y(:,i) = y(:,i) + 2*(i-1); - end - else - scale = 1; - end - - if nargout >= 1, y_r = y; endif - if nargout >= 2, t_r = t; endif - if nargout >= 3, scale_r = scale; endif - if nargout == 0 - if channels > 1 - unwind_protect ## protect plot state - ylabel(sprintf('signal scaled by %f', scale)); - axis([min(t), max(t), -1, 2*channels-1]); - plot(t,y,plotstr); - unwind_protect_cleanup - axis(); ylabel(""); - end_unwind_protect - else - plot(t,y,plotstr); - end - endif -end - -%!demo -%! [x, fs] = auload(file_in_loadpath("sample.wav")); -%! subplot(211); title("single channel"); auplot(x,fs); -%! subplot(212); title("2 channels, x and 3x"); auplot([x, 3*x], fs); -%! oneplot(); title(""); - -%!demo -%! [x, fs] = auload(file_in_loadpath("sample.wav")); -%! global auplot_points; pts=auplot_points; -%! global auplot_format; fmt=auplot_format; -%! auplot_points=300; -%! subplot(221); title("filled"); auplot_format="filled"; auplot(x,fs); -%! subplot(223); title("outline"); auplot_format="outline"; auplot(x,fs); -%! auplot_points=900; -%! subplot(222); title("sampled"); auplot_format="sampled"; auplot(x,fs); -%! subplot(224); title("direct"); auplot_format="direct"; auplot(x,fs); -%! auplot_format=fmt; auplot_points=pts; title(""); oneplot(); - -%!demo -%! [x, fs] = auload(file_in_loadpath("sample.wav")); -%! title("subrange example"); auplot(au(x,fs,300,450),fs) -%! title(""); - -%!error auplot -%!error auplot(1,2,3,4,5)
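The fast-plotting path in the removed auplot.m splits the signal into windows of `r = ceil(samples/auplot_points)` samples and keeps only the maximum and minimum of each window, plus a shorter tail window if the length does not divide evenly. The Python sketch below covers that reduction step for a single channel. The function name is hypothetical, and the assembly of the values into a filled or outline polygon is left out.

```python
import math

def minmax_envelope(x, points=1000):
    """Reduce x to per-window maxima and minima, as in auplot's fast path."""
    n = len(x)
    if n == 0:
        raise ValueError("signal must not be empty")
    r = math.ceil(n / points)  # samples per window
    c = n // r                 # number of full windows
    upper = [max(x[i * r:(i + 1) * r]) for i in range(c)]
    lower = [min(x[i * r:(i + 1) * r]) for i in range(c)]
    if n > c * r:              # leftover samples form a shorter tail window
        tail = x[c * r:]
        upper.append(max(tail))
        lower.append(min(tail))
    return r, upper, lower
```

When `r` is 1 every window holds a single sample and both lists equal the input, which corresponds to the case where auplot draws the full signal.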
--- a/main/audio/inst/ausave.m Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,222 +0,0 @@ -## Copyright (C) 1999 Paul Kienzle <pkienzle@users.sf.net> -## -## This program is free software; you can redistribute it and/or modify it under -## the terms of the GNU General Public License as published by the Free Software -## Foundation; either version 3 of the License, or (at your option) any later -## version. -## -## This program is distributed in the hope that it will be useful, but WITHOUT -## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more -## details. -## -## You should have received a copy of the GNU General Public License along with -## this program; if not, see <http://www.gnu.org/licenses/>. - -## usage: ausave('filename.ext', x, fs, format) -## -## Writes an audio file with the appropriate header. The extension on -## the filename determines the layout of the header. Currently supports -## .wav and .au layouts. Data is a matrix of audio samples in the -## range [-1,1] (inclusive), one row per time step, one column per -## channel. Fs defaults to 8000 Hz. Format is one of ulaw, alaw, uchar, -## short, long, float, double -## -## Note that translating the symmetric range [-1,1] into the asymmetric -## range [-2^n,2^n-1] requires a DC offset of -2/2^n. The inverse -## process used by auload requires a DC offset of 2/2^n, so loading and -## saving a file will not change the contents. Other applications may -## compensate for the asymmetry in a different way (including previous -## versions of auload/ausave) so you may find small differences in -## calculated DC offsets for the same file. - -function ausave (filename, data, rate = 8000, sampleformat = "int16") - - if (nargin < 2 || nargin > 4) - print_usage (); - elseif (! ischar (filename)) - error ("ausave: FILENAME must be a string"); - elseif (! 
isnumeric (data) || ndims (data) != 2) - error ("ausave: DATA must be a numeric 2D matrix"); - end - - ext = rindex (filename, '.'); - if (ext == 0) - error ("ausave: FILENAME `%s' has no extension", filename); - end - ext = tolower (substr (filename, ext+1, length (filename) -ext)); - - # determine data size and orientation - [samples, channels] = size (data); - if (samples < channels) - data = data.'; - [samples, channels] = size (data); - endif - - ## FIXME: should we give an error instead on input check? - ## Make sure the data fits into the sample range - scale = max (abs (data(:))); - if (scale > 1) - warning ("ausave: DATA exceeds range [-1,1] --- rescaling"); - data = data / scale; - endif - - ## Microsoft .wav format - if (strcmp (ext,'wav')) - - ## Header format obtained from sox/wav.c - ## April 15, 1992 - ## Copyright 1992 Rick Richardson - ## Copyright 1991 Lance Norskog And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. This copyright notice must be maintained. - ## Lance Norskog And Sundry Contributors are not responsible for - ## the consequences of using this software. 
- - switch (sampleformat) - case "uchar", formatid = 1; samplesize = 1; - case "short", formatid = 1; samplesize = 2; - case "long", formatid = 1; samplesize = 4; - case "float", formatid = 3; samplesize = 4; - case "double", formatid = 3; samplesize = 8; - case "alaw", formatid = 6; samplesize = 1; - case "ulaw", formatid = 7; samplesize = 1; - otherwise, error ("ausave: SAMPLEFORMAT `%s' is invalid for .wav file", sampleformat); - end - - datasize = channels*samplesize*samples; - - [file, msg] = fopen (filename, 'wb'); - if (file == -1) - error ("ausave: unable to fopen `%s' for writing: %s", filename, msg); - end - - ## write the magic header - arch = 'ieee-le'; - fwrite(file, toascii('RIFF'), 'int8'); - fwrite(file, datasize+36, 'int32', 0, arch); - fwrite(file, toascii('WAVE'), 'int8'); - - ## write the "fmt " section - fwrite(file, toascii('fmt '), 'int8'); - fwrite(file, 16, 'int32', 0, arch); - fwrite(file, formatid, 'int16', 0, arch); - fwrite(file, channels, 'int16', 0, arch); - fwrite(file, rate, 'int32', 0, arch); - fwrite(file, rate*channels*samplesize, 'int32', 0, arch); - fwrite(file, channels*samplesize, 'int16', 0, arch); - fwrite(file, samplesize*8, 'int16', 0, arch); - - ## write the "data" section - fwrite(file, toascii('data'), 'int8'); - fwrite(file, datasize, 'int32', 0, arch); - - ## Sun .au format - elseif (strcmp (ext, 'au')) - - ## Header format obtained from sox/au.c - ## September 25, 1991 - ## Copyright 1991 Guido van Rossum And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. This copyright notice must be maintained. - ## Guido van Rossum And Sundry Contributors are not responsible for - ## the consequences of using this software. 
- - switch (sampleformat) - case "ulaw", formatid = 1; samplesize = 1; - case "uchar", formatid = 2; samplesize = 1; - case "short", formatid = 3; samplesize = 2; - case "long", formatid = 5; samplesize = 4; - case "float", formatid = 6; samplesize = 4; - case "double", formatid = 7; samplesize = 8; - otherwise, error ("ausave: SAMPLEFORMAT `%s' is invalid for .au file", sampleformat); - end - - datasize = channels*samplesize*samples; - - [file, msg] = fopen (filename, 'wb'); - if (file == -1) - error ("ausave: unable to fopen `%s' for writing: %s", filename, msg); - end - - arch = 'ieee-be'; - fwrite(file, toascii('.snd'), 'int8'); - fwrite(file, 24, 'int32', 0, arch); - fwrite(file, datasize, 'int32', 0, arch); - fwrite(file, formatid, 'int32', 0, arch); - fwrite(file, rate, 'int32', 0, arch); - fwrite(file, channels, 'int32', 0, arch); - - ## Apple/SGI .aiff format - elseif (any (strcmp (ext, {"aiff", "aif"}))) - - ## Header format obtained from sox/aiff.c - ## September 25, 1991 - ## Copyright 1991 Guido van Rossum And Sundry Contributors - ## This source code is freely redistributable and may be used for - ## any purpose. This copyright notice must be maintained. - ## Guido van Rossum And Sundry Contributors are not responsible for - ## the consequences of using this software. - ## - ## IEEE 80-bit float I/O taken from - ## ftp://ftp.mathworks.com/pub/contrib/signal/osprey.tar - ## David K. 
Mellinger - ## dave@mbari.org - ## +1-831-775-1805 - ## fax -1620 - ## Monterey Bay Aquarium Research Institute - ## 7700 Sandholdt Road - - switch (sampleformat) - case "uchar", samplesize = 1; - case "short", samplesize = 2; - case "long", samplesize = 4; - otherwise, error ("ausave: SAMPLEFORMAT `%s' is invalid for .aiff file", sampleformat); - end - datasize = channels*samplesize*samples; - - [file, msg] = fopen (filename, 'wb'); - if (file == -1) - error ("ausave: unable to fopen `%s' for writing: %s", filename, msg); - end - - ## write the magic header - arch = 'ieee-be'; - fwrite(file, toascii('FORM'), 'int8'); - fwrite(file, datasize+46, 'int32', 0, arch); - fwrite(file, toascii('AIFF'), 'int8'); - - ## write the "COMM" section - fwrite(file, toascii('COMM'), 'int8'); - fwrite(file, 18, 'int32', 0, arch); - fwrite(file, channels, 'int16', 0, arch); - fwrite(file, samples, 'int32', 0, arch); - fwrite(file, 8*samplesize, 'int16', 0, arch); - fwrite(file, 16414, 'uint16', 0, arch); % sample rate exponent - fwrite(file, [rate, 0], 'uint32', 0, arch); % sample rate mantissa - - ## write the "SSND" section - fwrite(file, toascii('SSND'), 'int8'); - fwrite(file, datasize+8, 'int32', 0, arch); # section length - fwrite(file, 0, 'int32', 0, arch); # block size - fwrite(file, 0, 'int32', 0, arch); # offset - - ## file extension unknown - else - error ("ausave: unsupported extension `%s' in FILENAME `%s'", ext, filename); - end - - ## convert samples from range [-1, 1] - switch (sampleformat) - case "alaw" precision = "uint8"; error("FIXME: ausave needs linear to alaw conversion\n"); - case "ulaw" precision = "uint8"; data = lin2mu (data, 0); - case "uchar" precision = "uint8"; data = round ((data+1)*127.5); - case "short" precision = "in16"; data = round (data*32767.5 - 0.5); - case "long" precision = "int32"; data = round (data*(2^31-0.5) - 0.5); - otherwise, precision = sampleformat; - endswitch - - fwrite (file, data', precision, 0, arch); - fclose (file); - 
-endfunction
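The DC-offset note in the ausave header comment can be made concrete with a small round-trip sketch. The encoder below mirrors the `short` branch of the conversion switch (`round (data*32767.5 - 0.5)`). The decoder is an assumed inverse, since auload is not part of this hunk, and the names `encode_short` and `decode_short` are hypothetical. It is written in C++, like aurecord.cc in this changeset, purely as an illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Map a sample in [-1, 1] onto the asymmetric 16-bit range [-32768, 32767],
// following the "short" case in ausave: round(x*32767.5 - 0.5).
std::int16_t encode_short(double x)
{
    assert(x >= -1.0 && x <= 1.0);  // ausave rescales data outside this range
    return static_cast<std::int16_t>(std::lround(x * 32767.5 - 0.5));
}

// Assumed inverse mapping (not taken from auload, which is not shown here):
// undo the -0.5 shift, then undo the scale.
double decode_short(std::int16_t v)
{
    return (static_cast<double>(v) + 0.5) / 32767.5;
}
```

With this mapping, 0.0 encodes to -1 rather than 0, which is the small negative DC offset the comment describes, while the endpoints 1.0 and -1.0 land exactly on 32767 and -32768.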
--- a/main/audio/inst/sound.m Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,159 +0,0 @@ -## Copyright (C) 1999-2000 Paul Kienzle <pkienzle@users.sf.net> -## -## This program is free software; you can redistribute it and/or modify it under -## the terms of the GNU General Public License as published by the Free Software -## Foundation; either version 3 of the License, or (at your option) any later -## version. -## -## This program is distributed in the hope that it will be useful, but WITHOUT -## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more -## details. -## -## You should have received a copy of the GNU General Public License along with -## this program; if not, see <http://www.gnu.org/licenses/>. - -## usage: sound(x [, fs, bs]) -## -## Play the signal through the speakers. Data is a matrix with -## one column per channel. Rate fs defaults to 8000 Hz. The signal -## is clipped to [-1, 1]. Buffer size bs controls how many audio samples -## are clipped and buffered before sending them to the audio player. bs -## defaults to fs, which is equivalent to 1 second of audio. -## -## Note that if $DISPLAY != $HOSTNAME:n then a remote shell is opened -## to the host specified in $HOSTNAME to play the audio. See manual -## pages for ssh, ssh-keygen, ssh-agent and ssh-add to learn how to -## set it up. -## -## This function writes the audio data through a pipe to the program -## "play" from the sox distribution. sox runs pretty much anywhere, -## but it only has audio drivers for OSS (primarily linux and freebsd) -## and SunOS. 
In case your local machine is not one of these, write -## a shell script such as ~/bin/octaveplay, substituting AUDIO_UTILITY -## with whatever audio utility you happen to have on your system: -## #!/bin/sh -## cat > ~/.octave_play.au -## SYSTEM_AUDIO_UTILITY ~/.octave_play.au -## rm -f ~/.octave_play.au -## and set the global variable (e.g., in .octaverc) -## global sound_play_utility="~/bin/octaveplay"; -## -## If your audio utility can accept an AU file via a pipe, then you -## can use it directly: -## global sound_play_utility="SYSTEM_AUDIO_UTILITY flags" -## where flags are whatever you need to tell it that it is receiving -## an AU file. -## -## With clever use of the command dd, you can chop out the header and -## dump the data directly to the audio device in big-endian format: -## global sound_play_utility="dd of=/dev/audio ibs=2 skip=12" -## or little-endian format: -## global sound_play_utility="dd of=/dev/dsp ibs=2 skip=12 conv=swab" -## but you lose the sampling rate in the process. -## -## Finally, you could modify sound.m to produce data in a format that -## you can dump directly to your audio device and use "cat >/dev/audio" -## as your sound_play_utility. Things you may want to do are resample -## so that the rate is appropriate for your machine and convert the data -## to mulaw and output as bytes. -## -## If you experience buffer underruns while playing audio data, the bs -## buffer size parameter can be increased to tradeoff interactivity -## for smoother playback. If bs=Inf, then all the data is clipped and -## buffered before sending it to the audio player pipe. By default, 1 -## sec of audio is buffered. 
- -function sound(data, rate, buffer_size) - - if nargin<1 || nargin>3 - usage("sound(x [, fs, bs])"); - endif - if nargin<2 || isempty(rate), rate = 8000; endif - if nargin<3 || isempty(buffer_size), buffer_size = rate; endif - if rows(data) != length(data), data=data'; endif - [samples, channels] = size(data); - - ## Check if the octave engine is running locally by seeing if the - ## DISPLAY environment variable is empty or if it is the same as the - ## host name of the machine running octave. The host name is - ## taken from the HOSTNAME environment variable if it is available, - ## otherwise it is taken from the "uname -n" command. - display=getenv("DISPLAY"); - colon = rindex(display,":"); - if isempty(display) || colon==1 - islocal = 1; - else - if colon, display = display(1:colon-1); endif - host=getenv("HOSTNAME"); - if isempty(host), - [status, host] = system("uname -n"); - ## trim newline from end of hostname - if !isempty(host), host = host(1:length(host)-1); endif - endif - islocal = strcmp(tolower(host),tolower(display)); - endif - - ## What do we use for playing? 
- global sound_play_utility; - if ~isempty(sound_play_utility), - ## User specified command - elseif (file_in_path(EXEC_PATH, "ofsndplay")) - ## Mac - sound_play_utility = "ofsndplay -" - elseif (file_in_path(EXEC_PATH, "play")) - ## Linux (sox) - sound_play_utility = "play -t AU -"; - else - error("sound.m: No command line utility found for sound playing"); - endif - - ## If not running locally, then must use ssh to execute play command - if islocal - fid=popen(sound_play_utility, "w"); - else - fid=popen(["ssh ", host, " ", sound_play_utility], "w"); - end - if fid < 0, - warning("sound could not open play process"); - else - ## write sun .au format header to the pipe - fwrite(fid, toascii(".snd"), 'char'); - fwrite(fid, 24, 'int32', 0, 'ieee-be'); - fwrite(fid, -1, 'int32', 0, 'ieee-be'); - fwrite(fid, 3, 'int32', 0, 'ieee-be'); - fwrite(fid, rate, 'int32', 0, 'ieee-be'); - fwrite(fid, channels, 'int32', 0, 'ieee-be'); - - if isinf(buffer_size), - fwrite(fid, 32767*clip(data,[-1, 1])', 'int16', 0, 'ieee-be'); - else - ## write data in blocks rather than all at once - nblocks = ceil(samples/buffer_size); - block_start = 1; - for i=1:nblocks, - block_end = min(size(data,1), block_start+buffer_size-1); - fwrite(fid, 32767*clip(data(block_start:block_end,:),[-1, 1])', 'int16', 0, 'ieee-be'); - block_start = block_end + 1; - end - endif - pclose(fid); - endif -end - -###### auplay based version: not needed if using sox -## ## If not running locally, then must use ssh to execute play command -## global sound_play_utility="~/bin/auplay" -## if islocal -## fid=popen(sound_play_utility, "w"); -## else -## fid=popen(["ssh ", host, " ", sound_play_utility], "w"); -## end -## fwrite(fid, rate, 'int32'); -## fwrite(fid, channels, 'int32'); -## fwrite(fid, 32767*clip(data,[-1, 1])', 'int16'); -## pclose(fid); - -%!demo -%! [x, fs] = auload(file_in_loadpath("sample.wav")); -%! sound(x,fs);
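The header that sound.m writes to the player pipe is six big-endian 32-bit fields: the `.snd` magic, a header size of 24, a data size of -1 (length unknown, as is usual when streaming), encoding 3 (16-bit linear PCM), the sample rate, and the channel count. A minimal C++ sketch of the same byte layout follows. The helper names `put_be32` and `au_stream_header` are hypothetical and do not come from the package.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append a 32-bit value in big-endian byte order ('ieee-be' in the fwrite calls).
static void put_be32(std::vector<unsigned char>& out, std::uint32_t v)
{
    out.push_back(static_cast<unsigned char>((v >> 24) & 0xFF));
    out.push_back(static_cast<unsigned char>((v >> 16) & 0xFF));
    out.push_back(static_cast<unsigned char>((v >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(v & 0xFF));
}

// Build the 24-byte Sun .au header in the same field order as sound.m.
std::vector<unsigned char> au_stream_header(std::uint32_t rate, std::uint32_t channels)
{
    assert(rate > 0 && channels > 0);
    std::vector<unsigned char> h = {'.', 's', 'n', 'd'};  // magic
    put_be32(h, 24);           // header size in bytes
    put_be32(h, 0xFFFFFFFFu);  // data size -1: unknown length
    put_be32(h, 3);            // encoding 3: 16-bit linear PCM
    put_be32(h, rate);         // samples per second
    put_be32(h, channels);     // interleaved channel count
    return h;
}
```

The 16-bit samples that follow the header are also written big-endian, which is why the `dd` recipes in the usage comment skip 12 two-byte blocks (24 bytes) to reach the raw data.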
--- a/main/audio/inst/soundsc.m Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,65 +0,0 @@
-## Copyright (C) 2000 Paul Kienzle <pkienzle@users.sf.net>
-##
-## This program is free software; you can redistribute it and/or modify it under
-## the terms of the GNU General Public License as published by the Free Software
-## Foundation; either version 3 of the License, or (at your option) any later
-## version.
-##
-## This program is distributed in the hope that it will be useful, but WITHOUT
-## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
-## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
-## details.
-##
-## You should have received a copy of the GNU General Public License along with
-## this program; if not, see <http://www.gnu.org/licenses/>.
-
-## usage: soundsc(x, fs, limit) or soundsc(x, fs, [ lo, hi ])
-##
-## soundsc(x)
-##    Scale the signal so that [min(x), max(x)] -> [-1, 1], then
-##    play it through the speakers at 8000 Hz sampling rate. The
-##    signal has one column per channel.
-##
-## soundsc(x,fs)
-##    Scale the signal and play it at sampling rate fs.
-##
-## soundsc(x, fs, limit)
-##    Scale the signal so that [-|limit|, |limit|] -> [-1, 1], then
-##    play it at sampling rate fs. If fs is empty, then the default
-##    8000 Hz sampling rate is used.
-##
-## soundsc(x, fs, [ lo, hi ])
-##    Scale the signal so that [lo, hi] -> [-1, 1], then play it
-##    at sampling rate fs. If fs is empty, then the default 8000 Hz
-##    sampling rate is used.
-##
-## y=soundsc(...)
-##    return the scaled waveform rather than play it.
-##
-## See sound for more information.
-
-function data_r = soundsc(data, rate, range)
-
-  if nargin < 1 || nargin > 3, usage("soundsc(x, fs, [lo, hi])") endif
-  if nargin < 2, rate = []; endif
-  if nargin < 3, range = [min(data(:)), max(data(:))]; endif
-  if isscalar(range), range = [-abs(range), abs(range)]; endif
-
-  data=(data - mean(range))/((range(2)-range(1))/2);
-  if nargout > 0
-    data_r = data;
-  else
-    sound(data, rate);
-  endif
-endfunction
-
-
-%!demo
-%! [x, fs] = auload(file_in_loadpath("sample.wav"));
-%! soundsc(x,fs);
-
-%!shared y
-%! [x, fs] = auload(file_in_loadpath("sample.wav"));
-%! y=soundsc(x);
-%!assert (min(y(:)), -1, eps)
-%!assert (max(y(:)), 1, eps)
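The scaling in soundsc is a single affine map: subtract the midpoint of the range, then divide by half its width, so that `lo` goes to -1 and `hi` goes to 1. The C++ sketch below restates that arithmetic for the three call forms. The function name `scale_to_unit` and the use of `std::vector<double>` for a single channel are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Map [lo, hi] linearly onto [-1, 1]: (x - mean(range)) / ((hi - lo)/2).
std::vector<double> scale_to_unit(std::vector<double> data, double lo, double hi)
{
    assert(hi > lo);  // a zero-width range would divide by zero
    const double centre = (lo + hi) / 2.0;
    const double half_width = (hi - lo) / 2.0;
    for (double& x : data) x = (x - centre) / half_width;
    return data;
}

// soundsc(x): the range defaults to [min(x), max(x)].
std::vector<double> scale_to_unit(const std::vector<double>& data)
{
    assert(!data.empty());
    auto mm = std::minmax_element(data.begin(), data.end());
    return scale_to_unit(data, *mm.first, *mm.second);
}

// soundsc(x, fs, limit): a scalar limit means the range [-|limit|, |limit|].
std::vector<double> scale_to_unit(const std::vector<double>& data, double limit)
{
    return scale_to_unit(data, -std::fabs(limit), std::fabs(limit));
}
```

A constant signal has `min == max`, so the default range has zero width. The sketch rejects that case with an assertion, whereas the Octave expression would divide by zero.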
--- a/main/audio/src/.svnignore Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,2 +0,0 @@
-autom4te.cache
-configure
--- a/main/audio/src/Makeconf.in Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,71 +0,0 @@
-
-## Makeconf is automatically generated from Makeconf.base and Makeconf.add
-## in the various subdirectories. To regenerate, use ./autogen.sh to
-## create a new ./Makeconf.in, then use ./configure to generate a new
-## Makeconf.
-
-OCTAVE_FORGE = 1
-
-SHELL = @SHELL@
-
-canonical_host_type = @canonical_host_type@
-prefix = @prefix@
-exec_prefix = @exec_prefix@
-bindir = @bindir@
-mandir = @mandir@
-libdir = @libdir@
-datadir = @datadir@
-infodir = @infodir@
-includedir = @includedir@
-datarootdir = @datarootdir@
-INSTALL = @INSTALL@
-INSTALL_PROGRAM = @INSTALL_PROGRAM@
-INSTALL_SCRIPT = @INSTALL_SCRIPT@
-INSTALL_DATA = @INSTALL_DATA@
-INSTALLOCT=octinst.sh
-
-DESTDIR =
-
-RANLIB = @RANLIB@
-STRIP = @STRIP@
-LN_S = @LN_S@
-MKOCTLINK = @MKOCTLINK@
-OCTLINK= @OCTLINK@
-
-AWK = @AWK@
-
-# Most octave programs will be compiled with $(MKOCTFILE). Those which
-# cannot use mkoctfile directly can request the flags that mkoctfile
-# would use as follows:
-# FLAG = $(shell $(MKOCTFILE) -p FLAG)
-# The following flags are for compiling programs that are independent
-# of Octave. How confusing.
-CC = @CC@
-CFLAGS = @CFLAGS@
-CPPFLAGS = @CPPFLAGS@
-CPICFLAG = @CPICFLAG@
-CXX = @CXX@
-CXXFLAGS = @CXXFLAGS@
-CXXPICFLAG = @CXXPICFLAG@
-F77 = @F77@
-FFLAGS = @FFLAGS@
-FPICFLAG = @FPICFLAG@
-
-OCTAVE = @OCTAVE@
-OCTAVE_VERSION = @OCTAVE_VERSION@
-MKOCTFILE = @MKOCTFILE@ -DHAVE_OCTAVE_$(ver) -v
-SHLEXT = @SHLEXT@
-
-ver = @ver@
-MPATH = @mpath@
-OPATH = @opath@
-XPATH = @xpath@
-ALTMPATH = @altmpath@
-ALTOPATH = @altopath@
-
-@DEFHAVE_LINUX_SOUNDCARD@
-
-%.o: %.c ; $(MKOCTFILE) -c $<
-%.o: %.f ; $(MKOCTFILE) -c $<
-%.o: %.cc ; $(MKOCTFILE) -c $<
-%.oct: %.cc ; $(MKOCTFILE) $<
--- a/main/audio/src/Makefile Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,23 +0,0 @@
-sinclude Makeconf
-
-all:
-ifdef HAVE_LINUX_SOUNDCARD
-	$(MAKE) -f Makefile.linux
-endif # HAVE_LINUX_SOUNDCARD is not defined
-ifeq (apple-darwin,$(findstring apple-darwin,$(canonical_host_type)))
-	$(MAKE) -f Makefile.macosx
-endif # ifeq (apple-darwin)
-
-clean:
-	@echo "Cleaning..."; \
-	$(RM) -rf *.o core octave-core ../bin/* *~ *.oct
-
-distclean: clean
-	@echo "Really Cleaning..."; \
-	$(RM) -rf ../bin config.status config.log autom4te.cache Makeconf
-
-maintainer-clean realclean: distclean
-	@echo "Cleaning maintainer files..."; \
-	$(RM) -rf ../bin configure
-
-dist : all
--- a/main/audio/src/Makefile.linux Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,18 +0,0 @@
-sinclude Makeconf
-
-ifneq (,$(findstring test,$(MAKECMDGOALS)))
-CXXFLAGS := $(CXXFLAGS) -DTEST
-TEST = -DTEST
-endif
-
-all: aurecord.oct
-	$(MKOCTFILE) -DHAVE_CONFIG_H aurecord.cc endpoint.cc
-
-test: aurecord
-
-aurecord: aurecord.o endpoint.o
-	$(CXX) $(CXXFLAGS) -o $@ aurecord.o endpoint.o
-
-aurecord.o endpoint.o : endpoint.h
-
-%.o: %.cc ; $(MKOCTFILE) $(TEST) -c $<
--- a/main/audio/src/Makefile.macosx Sat Jul 26 20:53:49 2014 +0000
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,9 +0,0 @@
-sinclude ./Makeconf
-
-all: ../bin/ofsndplay
-
-## Using explicit -ObjC flag since .cc is used to trick PKG_ADD to
-## pick up the file as a CPP file.
-../bin/ofsndplay: OFSndPlay.cc
-	mkdir -p ../bin
-	$(CC) -ObjC -o ../bin/ofsndplay OFSndPlay.cc -framework Cocoa
--- a/main/audio/src/OFSndPlay.cc Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,74 +0,0 @@ -// Author: Per Persson <persquare@users.sf.net> -// This program is granted to the public domain. - -/* ofsndplay - * Based on code by Chuck Bennet <chuck@benatong.com> - * and Matthew McCabe <mlm@escapement.net> - */ - -#import <Foundation/Foundation.h> -#import <AppKit/AppKit.h> - -@interface OFSoundPlayer:NSObject { -} -- (void)playFile:(NSString *)thePath; -- (void)playData:(NSData *)data; -- (void)sound:(NSSound *)sound didFinishPlaying:(BOOL)aBool; -@end - -@implementation OFSoundPlayer -- (void)playFile:(NSString *)thePath -{ - NSSound *sound = [[NSSound alloc] initWithContentsOfFile:thePath byReference:YES]; - [sound setDelegate: self]; - if([sound play] == YES) { - [[NSRunLoop currentRunLoop] run]; - } -} - -- (void)playData:(NSData *)data -{ - NSSound *sound = [[NSSound alloc] initWithData:data]; - [sound setDelegate: self]; - if([sound play] == YES) { - [[NSRunLoop currentRunLoop] run]; - } -} - -// According to the docs(?) the runloop should exit when the sound is -// finished. It doesn't. Instead, use this delegate method to exit -// the process. 
-- (void)sound:(NSSound *)sound didFinishPlaying:(BOOL)aBool -{ - exit(0); -} -@end - -int main (int argc, const char * argv[]) -{ - NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init]; - NSMutableData *soundData = [NSMutableData dataWithCapacity:10000]; - OFSoundPlayer *player = [[OFSoundPlayer alloc] init]; - - if(argc != 2 || (*argv[1] == '-' && strcmp(argv[1], "-") != 0)) - { - fprintf(stderr,"Usage: \t\'sndplay filename[.ext]\' or\n\t\'sndplay -\' to accept sound data via a pipe.\n"); - return -1; - } - - if(strcmp(argv[1], "-") == 0) { - // Read from pipe - NSFileHandle *readHandle = [NSFileHandle fileHandleWithStandardInput]; - NSData *inData = nil; - while ((inData = [readHandle availableData]) && [inData length]) { - [soundData appendData:inData]; - } - [player playData:soundData]; - } else { - // Read from file - [player playFile:[[NSString stringWithCString:argv[1]] stringByStandardizingPath]]; - } - // If we ever get here, the file/data was not a valid sound. - [pool release]; - return 0; -}
--- a/main/audio/src/aurecord.cc Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,555 +0,0 @@ -/* - * HISTORY: - * May, 1999 - separate audio open/close from wave play - * Feb. 1999 - first public release. - * - * Copyright 1999 Paul Kienzle <pkienzle@users.sf.net> - * This source code is freely redistributable and may be used for - * any purpose. This copyright notice must be maintained. - * Paul Kienzle is not responsible for the consequences of using - * this software. -## TODO: Support SGI, Sun and Windows devices -## TODO: Clean up user interaction, possibly adding GUI support - */ - -#include <stdio.h> -#include <stdlib.h> -#include <string.h> -#include <unistd.h> -#include <fcntl.h> -#include <sys/ioctl.h> -#include <signal.h> -#include "endpoint.h" - - -#ifdef TEST -#include <stdarg.h> - -void mymessage (const char *fmt, ...) -{ - va_list args; - va_start (args, fmt); - fprintf (stderr, fmt, args); - va_end (args); -} -#else -#include <octave/oct.h> -void mymessage (const char *fmt, ...) 
-{ - va_list args; - va_start (args, fmt); - message ("aurecord", fmt, args); - va_end (args); -} -#endif - -/* ==================================================================== */ -/* Input conversion routines (audio file -> machine representation) */ - -/* Read a 2 byte signed integer in little endian (Intel) format */ -static int from_S16_LE(char *buf, short *sample) -{ -#if __BYTE_ORDER == __BIG_ENDIAN - { - char t; - t = buf[0]; buf[0] = buf[1]; buf[1] = t; - } -#endif - *sample = *(short *)buf; - return 2; -} - -/* Read a 2 byte signed integer in big endian (non-Intel) format */ -static int from_S16_BE(char *buf, short *sample) -{ -#if __BYTE_ORDER == __LITTLE_ENDIAN - { - char t; - t = buf[0]; buf[0] = buf[1]; buf[1] = t; - } -#endif - *sample = *(short *)buf; - return 2; -} - - -/* Read a 2 byte unsigned integer in little endian (Intel) format */ -static int from_U16_LE(char *buf, short *sample) -{ -#if __BYTE_ORDER == __BIG_ENDIAN - { - char t; - t = buf[0]; buf[0] = buf[1]; buf[1] = t; - } -#endif - *sample = (short)((long)(*(unsigned short *)buf) - 32768); - return 2; -} - -/* Read a 2 byte unsigned integer in big endian (non-Intel) format */ -static int from_U16_BE(char *buf, short *sample) -{ -#if __BYTE_ORDER == __LITTLE_ENDIAN - { - char t; - t = buf[0]; buf[0] = buf[1]; buf[1] = t; - } -#endif - *sample = (short)((long)(*(unsigned short *)buf) - 32768); - return 2; -} - -/* Read a 1 byte aLaw compressed value and convert to 2 byte signed integer */ -static int from_A_LAW(char *buf, short *sample) -{ - static short alaw[] = { - -5504, -5248, -6016, -5760, -4480, -4224, -4992, -4736, - -7552, -7296, -8064, -7808, -6528, -6272, -7040, -6784, - -2752, -2624, -3008, -2880, -2240, -2112, -2496, -2368, - -3776, -3648, -4032, -3904, -3264, -3136, -3520, -3392, - -22016, -20992, -24064, -23040, -17920, -16896, -19968, -18944, - -30208, -29184, -32256, -31232, -26112, -25088, -28160, -27136, - -11008, -10496, -12032, -11520, -8960, -8448, -9984, -9472, - 
-15104, -14592, -16128, -15616, -13056, -12544, -14080, -13568, - -344, -328, -376, -360, -280, -264, -312, -296, - -472, -456, -504, -488, -408, -392, -440, -424, - -88, -72, -120, -104, -24, -8, -56, -40, - -216, -200, -248, -232, -152, -136, -184, -168, - -1376, -1312, -1504, -1440, -1120, -1056, -1248, -1184, - -1888, -1824, -2016, -1952, -1632, -1568, -1760, -1696, - -688, -656, -752, -720, -560, -528, -624, -592, - -944, -912, -1008, -976, -816, -784, -880, -848 }; - unsigned char t; - - t = *(unsigned char *)buf; - if (t>=128) *sample = -alaw[t&0x7F]; - else *sample = alaw[t&0x7F]; - return 1; -} - -/* Read a 1 byte uLaw compressed value and convert to 2 byte signed integer */ -static int from_MU_LAW(char *buf, short *sample) -{ - static short ulaw[] = { - -32124, -31100, -30076, -29052, -28028, -27004, -25980, -24956, - -23932, -22908, -21884, -20860, -19836, -18812, -17788, -16764, - -15996, -15484, -14972, -14460, -13948, -13436, -12924, -12412, - -11900, -11388, -10876, -10364, -9852, -9340, -8828, -8316, - -7932, -7676, -7420, -7164, -6908, -6652, -6396, -6140, - -5884, -5628, -5372, -5116, -4860, -4604, -4348, -4092, - -3900, -3772, -3644, -3516, -3388, -3260, -3132, -3004, - -2876, -2748, -2620, -2492, -2364, -2236, -2108, -1980, - -1884, -1820, -1756, -1692, -1628, -1564, -1500, -1436, - -1372, -1308, -1244, -1180, -1116, -1052, -988, -924, - -876, -844, -812, -780, -748, -716, -684, -652, - -620, -588, -556, -524, -492, -460, -428, -396, - -372, -356, -340, -324, -308, -292, -276, -260, - -244, -228, -212, -196, -180, -164, -148, -132, - -120, -112, -104, -96, -88, -80, -72, -64, - -56, -48, -40, -32, -24, -16, -8, 0}; - unsigned char t; - - t = *(unsigned char *)buf; - if (t>=128) *sample = -ulaw[t&0x7F]; - else *sample = ulaw[t&0x7F]; - return 1; -} - -/* Read a 1 byte unsigned value and convert to 2 byte signed integer */ -static int from_U8(char *buf, short *sample) -{ - unsigned char t; - - t = *(unsigned char *)buf; - *sample = (t-128)<<8; - 
return 1; -} - -/* Read a 1 byte unsigned value and convert to 2 byte signed integer */ -static int from_S8(char *buf, short *sample) -{ - unsigned char t; - - t = *(unsigned char *)buf; - *sample = t<<8; - return 1; -} - -/* ===================================================================== */ -/* Audio device routines */ - -/* Okay, now for the OS specific audio code: - * - * audioopen(int rate, int channels) returns true if the audio device - * has been opened. This routine must set the global variables - * audiorate and audiochannels to the actual rate and channels - * selected for the device which may be different from those - * requested. This routine must also set audioconvert, the function - * which takes the machine representation for samples (2 byte signed - * integers) and converts them to the audio format specified for the - * audio device. - * - * audioplay(void *data, int length) returns true if data was played. - * The data has already been converted to the correct rate, number of - * channels and audio format for the device. The length is the number - * of BYTES to play (not the number of samples). - * - * audioclose() closes the audio device. */ - -typedef int (*CONVERSION)(char *buf, short *sample); -static CONVERSION audioconvert; -static int audiorate; -static int audiochannels; - -/* ==================================================================== */ -#if 1 /* LINUX OSS audio drivers */ -#include <linux/soundcard.h> - -static int audio = -1; -int audioopen(int rate, int channels) -{ - int format, outformat, mask; - - /* Open audio device */ - audio = open("/dev/dsp", O_RDONLY); - if (audio < 0) return -1; - - /* Set channels (mono vs. stereo) and remember what was set */ - --channels; - if (ioctl(audio, SNDCTL_DSP_STEREO, &channels) < 0) goto error; - audiochannels = channels+1; - - /* Set input format. Convert to a format which preserves the most - * bits if the selected format is unavailable. 
- */ -#if __BYTE_ORDER == __LITTLE_ENDIAN - outformat = format = AFMT_S16_LE, audioconvert=from_S16_LE; -#else - outformat = format = AFMT_S16_BE, audioconvert=from_S16_BE; -#endif - if (ioctl(audio, SNDCTL_DSP_SETFMT, &outformat) < 0) goto error; - if (outformat != format) { - if (ioctl(audio, SNDCTL_DSP_GETFMTS, &mask) < 0) goto error; - if (mask&AFMT_S16_LE) format = AFMT_S16_LE, audioconvert=from_S16_LE; - else if (mask&AFMT_S16_BE) format = AFMT_S16_BE, audioconvert=from_S16_BE; - else if (mask&AFMT_U16_LE) format = AFMT_U16_LE, audioconvert=from_U16_LE; - else if (mask&AFMT_U16_BE) format = AFMT_U16_BE, audioconvert=from_U16_BE; - else if (mask&AFMT_MU_LAW) format = AFMT_MU_LAW, audioconvert=from_MU_LAW; - else if (mask&AFMT_A_LAW) format = AFMT_A_LAW, audioconvert=from_A_LAW; - else if (mask&AFMT_U8) format = AFMT_U8, audioconvert=from_U8; - else if (mask&AFMT_S8) format = AFMT_S8, audioconvert=from_S8; - else goto error; - if (ioctl(audio, SNDCTL_DSP_SETFMT, &format) < 0) goto error; - } - - /* Set sample rate and remember what was set. 
*/ - if (ioctl(audio, SNDCTL_DSP_SPEED, &rate) < 0) goto error; - audiorate = rate; - return 1; - -error: - close(audio); - return 0; -} - -static short audiosample() -{ - static char buf[2048]; - static int bufpos = sizeof(buf); - int len; - short sample; - - if (bufpos >= sizeof(buf)) { - len = read(audio, buf, sizeof(buf)); - while (len < sizeof(buf)) buf[len++] = 0; - bufpos = 0; - } - bufpos += (*audioconvert)(buf+bufpos, &sample); - return sample; -} - -void audioclose() -{ - close(audio); - audio = -1; -} - -void audioabort() -{ - if (audio != -1) { - ioctl(audio, SNDCTL_DSP_RESET, NULL); - audioclose(); - } -} -#endif - -void inform(const char *str) -{ - if (str != NULL) { -#if 0 - mymessage ("\r%-38s", str); -#else - mymessage ("%s\n", str); -#endif - } - else - mymessage ("\n"); -} - -int capture(int rate, short *capturebuf, int capturelen) -{ - // Note: initial silence is WINDOW+2*STEP - const float STEP=0.010; // step size in sec - const float WINDOW=0.016; // window size in sec - const long ENDSILENCE=700; // duration of end silence in msec - const long MINLENGTH=300; // minimum utterance in msec - - endpointer *ep; - int framelen, framestep; - short *frame; - int framenumber=0; /* Currently active frame number */ - int framepos = 0; - int capturepos, captureend, remaining; - EPTAG tag, state=EP_RESET; - - /* initialize capture */ - framelen = (int)(WINDOW*(float)rate); - framestep = (int)(STEP*(float)rate); - frame = new short[framelen]; - ep = new endpointer(rate, framestep, framelen, ENDSILENCE, MINLENGTH); - - while (1) { - /* Fill the next frame */ - while (framepos < framelen) frame[framepos++] = audiosample(); - framenumber++; - - /* Process frame through the end point detector */ - tag = ep -> getendpoint (frame);// get endpoint tag -#if 0 - mymessage (" tag=%s, state=%s\n", - ep->gettagname(tag), ep->gettagname(state)); -#endif - switch (tag) { // determine what to do with this frame - case EP_NOSTARTSILENCE: // error condition --- restart 
process - if (tag == EP_NOSTARTSILENCE) - inform("Spoke too soon. Wait a bit and try again..."); - ep->initendpoint(); - framenumber = 0; - // fall through to RESET - - case EP_RESET: // false start --- restart recognizer - // fall through to SILENCE - - case EP_SILENCE: // not yet start of utterance - if (state != EP_SILENCE && framenumber > 3) { - inform("Waiting for you to speak..."); - state = EP_SILENCE; - } - capturepos = 0; - break; - - case EP_MAYBEEND: // possible end of utterance - if (tag == EP_MAYBEEND) captureend = capturepos; - // fall through to SIGNAL - - - case EP_NOTEND: // the last MAYBEEND was NOT the end - if (tag == EP_NOTEND) captureend = 0; - // fall through to SIGNAL - - case EP_INUTT: // confirmed signal start - // all data frames before this marked as EP_SIGNAL were part - // of the actual utterance. A reset after this point will be - // due to a rejected signal rather than a false start. - if (state != EP_INUTT) { - inform("Capturing your speech..."); - state = EP_INUTT; - } - // fall through to SIGNAL - - case EP_SIGNAL: // signal frame - // Copy frame into capture buf. - remaining = capturelen - capturepos; - if (remaining > framestep) remaining = framestep; - if (remaining > 0) - memcpy(capturebuf+capturepos, frame, remaining*sizeof(*frame)); - capturepos += remaining; - - // Check for end of capture buf. - if (capturepos == capturelen) { - if (captureend == 0) captureend = capturepos; - inform("Speech exceeded capture duration. Use -t to increase."); - inform(NULL); - return captureend; - } - break; - - case EP_ENDOFUTT: // confirmed end of utterance - // This is a silence frame after the end of signal. The previous - // MAYBEEND frame was the actual end of utterance - inform(NULL); - return captureend; - } - - /* Shift the frame overlap to the start of the frame. 
*/ - framepos = framelen - framestep; - memmove(frame, frame+framestep, framepos*sizeof(*frame)); - } - - return 0; -} - - -void cleanup(int sig) -{ - audioabort(); - exit(2); -} - -#ifdef TEST - -int main(int argc, char *argv[]) -{ - int do_endpoint = 0; - int rate=16000, channels=1; - double time=1; - short *buf; - int i, c, samples; - - - /* Interpret options */ - do { - c = getopt(argc, argv, "et:r:c:?"); - switch (c) { - case 'e': do_endpoint = 1; break; - case 'r': rate = atoi(optarg); break; - case 'c': channels = atoi(optarg); break; - case 't': time = atof(optarg); break; - case '?': - fprintf (stderr, "usage: aurecord [-t time] [-r rate] [-c channels]\n"); - exit(1); - } - } while (c != EOF); - - /* Prepare for interrupt. */ - signal(SIGINT, cleanup); - - /* open audio device and skip the first bunch of samples */ - if (audioopen(rate, channels) < 0) return 1; - for (i = 0; i < 1024; i++) audiosample(); - - fwrite(&audiorate, 4, 1, stdout); - fwrite(&audiochannels, 4, 1, stdout); - samples = (long)((double)audiorate * time)*audiochannels; - buf = new short[samples]; - - if (do_endpoint) { - /* wait for audio event before grabbing samples */ - samples = capture(audiorate, buf, samples); - } - else { - /* grab all the samples you need directly */ - for (i = 0; i < samples; i++) buf[i] = audiosample(); - } - - /* close the audio device */ - audioclose(); - - /* output the captured samples */ - fwrite(buf, 2, samples, stdout); - return 0; -} - -#else - -DEFUN_DLD (aurecord, args, nargout, - "-*- texinfo -*-\n\ -@deftypefn {Loadable Function} {[@var{x}, @var{fs}, @var{chan}] =} aurecord (@var{t}, @var{fs}, @var{chan})\n\ -@deftypefnx {Loadable Function} {[@var{x}, @var{fs}, @var{chan}] =} aurecord (@var{t}, @var{fs}, @var{chan}, 'endpoint')\n\ -\n\ -Record for the specified time at the given sample rate. Note that\n\ -the sample rate used may not match the requested sample rate. 
Use\n\ -the returned rate instead of the requested value in further\n\ -processing. Similarly, the actual number of samples and channels\n\ -may not match the request, so check the size of the returned matrix.\n\ -\n\ -@var{fs} defaults to 8000 Hz and @var{chan} defaults to 1. @var{time} is\n\ -measured in seconds. @code{aurecord} can return the actual number of\n\ -channels and the rate that is used, that may different from the ones\n\ -selected.\n\ -\n\ -If the argument 'endpoint' is given, we attempt to wait for audio event\n\ -before grabbing samples\n\ -@end deftypefn") -{ - int nargin = args.length (); - octave_value_list retval; - - if (nargin < 1 || nargin > 4) - print_usage (); - else - { - double time = args (0).double_value (); - int rate = 16000; - int channels = 1; - int do_endpoint = 0; - short *buf; - int i, c, samples; - - - if (nargin > 1) - rate = args (1).nint_value (); - - if (nargin > 2) - channels = args (2).nint_value (); - - if (nargin > 3) - { - std::string arg = args(3).string_value (); - if (arg == "endpoint") - do_endpoint = 1; - } - - if (! error_state) - { - /* Prepare for interrupt. 
*/ - signal(SIGINT, cleanup); - - /* open audio device and skip the first bunch of samples */ - if (audioopen (rate, channels) < 0) - error ("aurecord: can not open device"); - - for (i = 0; i < 1024; i++) - audiosample(); - - retval (2) = octave_value (audiochannels); - retval (1) = octave_value (audiorate); - - samples = (long)((double)audiorate * time)*audiochannels; - OCTAVE_LOCAL_BUFFER (short, buf, samples); - - if (do_endpoint) { - /* wait for audio event before grabbing samples */ - samples = capture(audiorate, buf, samples); - } - else { - /* grab all the samples you need directly */ - for (i = 0; i < samples; i++) buf[i] = audiosample(); - } - - /* close the audio device */ - audioclose(); - - /* output the captured samples */ - Matrix buf2 (samples / audiochannels, audiochannels); - for (i = 0; i < samples; i++) - buf2.xelem (i) = static_cast <double> (buf[i]) / 32768.; - - retval(0) = buf2; - } - } - - return retval; -} - -#endif
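Note on the removed capture() loop: each pass keeps the last framelen - framestep samples of the frame, moves them to the front with memmove, and refills only the tail. Below is a minimal standalone sketch of that overlapping-window scheme. The helper name make_frames is hypothetical (it is not part of the audio package), and it reads from an in-memory vector rather than from audiosample().

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not in the audio package): split `input` into
// frames of `framelen` samples that advance by `framestep` samples,
// mirroring the fill / process / shift pattern of capture().
std::vector<std::vector<short>> make_frames(const std::vector<short>& input,
                                            std::size_t framelen,
                                            std::size_t framestep)
{
    std::vector<std::vector<short>> frames;
    if (framelen == 0 || framestep == 0 || framestep > framelen)
        return frames;                      // nothing sensible to do

    std::vector<short> frame(framelen);
    std::size_t framepos = 0;               // next free slot in the frame
    std::size_t next = 0;                   // next unread input sample

    while (true) {
        // Fill the rest of the frame from the input.
        while (framepos < framelen && next < input.size())
            frame[framepos++] = input[next++];
        if (framepos < framelen)
            break;                          // input ran out before a full frame

        frames.push_back(frame);            // "process" the frame

        // Shift the overlap to the start of the frame.
        framepos = framelen - framestep;
        std::memmove(frame.data(), frame.data() + framestep,
                     framepos * sizeof(short));
    }
    return frames;
}
```

With framelen = 4 and framestep = 2, ten input samples give four frames, each sharing two samples with its predecessor.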
--- a/main/audio/src/autogen.sh Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,27 +0,0 @@ -#! /bin/sh - -## Generate ./configure -rm -f configure.in -echo "dnl --- DO NOT EDIT --- Automatically generated by autogen.sh" > configure.in -cat configure.base >> configure.in -cat <<EOF >> configure.in - AC_OUTPUT(\$CONFIGURE_OUTPUTS) - dnl XXX FIXME XXX chmod is not in autoconf's list of portable functions - - echo " " - echo " \"\\\$prefix\" is \$prefix" - echo " \"\\\$exec_prefix\" is \$exec_prefix" - AC_MSG_RESULT([\$STATUS_MSG - -find . -name NOINSTALL -print # shows which toolboxes won't be installed -]) -EOF - -autoconf configure.in > configure.tmp -if [ diff configure.tmp configure > /dev/null 2>&1 ]; then - rm -f configure.tmp; -else - mv -f configure.tmp configure - chmod 0755 configure -fi -rm -f configure.in
--- a/main/audio/src/configure.base Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,330 +0,0 @@ -dnl The configure script is generated by autogen.sh from configure.base -dnl and the various configure.add files in the source tree. Edit -dnl configure.base and reprocess rather than modifying ./configure. - -dnl autoconf 2.13 certainly doesn't work! What is the minimum requirement? -AC_PREREQ(2.2) - -AC_INIT(configure.base) - -PACKAGE=octave-forge -MAJOR_VERSION=0 -MINOR_VERSION=1 -PATCH_LEVEL=0 - -dnl Kill caching --- this ought to be the default -define([AC_CACHE_LOAD], )dnl -define([AC_CACHE_SAVE], )dnl - -dnl uncomment to put support files in another directory -dnl AC_CONFIG_AUX_DIR(admin) - -VERSION=$MAJOR_VERSION.$MINOR_VERSION.$PATCH_LEVEL -AC_SUBST(PACKAGE) -AC_SUBST(VERSION) - -dnl need to find admin files, so keep track of the top dir. -TOPDIR=`pwd` -AC_SUBST(TOPDIR) - -dnl if mkoctfile doesn't work, then we need the following: -dnl AC_PROG_CXX -dnl AC_PROG_F77 - -dnl Need C compiler regardless so define it in a way that -dnl makes autoconf happy and we can override whatever we -dnl need with mkoctfile -p. -dnl XXX FIXME XXX should use mkoctfile to get CC and CFLAGS -AC_PROG_CC - -dnl XXX FIXME XXX need tests for -p -c -s in mkoctfile. - -dnl ******************************************************************* -dnl Sort out mkoctfile version number and install paths - -dnl XXX FIXME XXX latest octave has octave-config so we don't -dnl need to discover things here. 
Doesn't have --exe-site-dir -dnl but defines --oct-site-dir and --m-site-dir - -dnl Check for mkoctfile -AC_CHECK_PROG(MKOCTFILE,mkoctfile,mkoctfile) -test -z "$MKOCTFILE" && AC_MSG_WARN([no mkoctfile found on path]) - -AC_SUBST(ver) -AC_SUBST(subver) -AC_SUBST(mpath) -AC_SUBST(opath) -AC_SUBST(xpath) -AC_SUBST(altpath) -AC_SUBST(altmpath) -AC_SUBST(altopath) - -AC_ARG_WITH(path, - [ --with-path install path prefix], - [ path=$withval ]) -AC_ARG_WITH(mpath, - [ --with-mpath override path for m-files], - [mpath=$withval]) -AC_ARG_WITH(opath, - [ --with-opath override path for oct-files], - [opath=$withval]) -AC_ARG_WITH(xpath, - [ --with-xpath override path for executables], - [xpath=$withval]) -AC_ARG_WITH(altpath, - [ --with-altpath alternative functions install path prefix], - [ altpath=$withval ]) -AC_ARG_WITH(altmpath, - [ --with-altmpath override path for alternative m-files], - [altmpath=$withval]) -AC_ARG_WITH(altopath, - [ --with-altopath override path for alternative oct-files], - [altopath=$withval]) - -if test -n "$path" ; then - test -z "$mpath" && mpath=$path - test -z "$opath" && opath=$path/oct - test -z "$xpath" && xpath=$path/bin - test -z "$altpath" && altpath=$path-alternatives -fi - -if test -n "$altpath" ; then - test -z "$altmpath" && altmpath=$altpath - test -z "$altopath" && altopath=$altpath/oct -fi - -dnl Don't query if path/ver are given in the configure environment -#if test -z "$mpath" || test -z "$opath" || test -z "$xpath" || test -z "$altmpath" || test -z "$altopath" || test -z "$ver" ; then -if test -z "$mpath" || test -z "$opath" || test -z "$xpath" || test -z "$ver" ; then - dnl Construct program to get mkoctfile version and local install paths - cat > conftest.cc <<EOF -#include <octave/config.h> -#include <octave/version.h> -#include <octave/defaults.h> - -#define INFOV "\nINFOV=" OCTAVE_VERSION "\n" - -#define INFOH "\nINFOH=" OCTAVE_CANONICAL_HOST_TYPE "\n" - -#ifdef OCTAVE_LOCALVERFCNFILEDIR -# define INFOM "\nINFOM=" 
OCTAVE_LOCALVERFCNFILEDIR "\n" -#else -# define INFOM "\nINFOM=" OCTAVE_LOCALFCNFILEPATH "\n" -#endif - -#ifdef OCTAVE_LOCALVEROCTFILEDIR -# define INFOO "\nINFOO=" OCTAVE_LOCALVEROCTFILEDIR "\n" -#else -# define INFOO "\nINFOO=" OCTAVE_LOCALOCTFILEPATH "\n" -#endif - -#ifdef OCTAVE_LOCALVERARCHLIBDIR -# define INFOX "\nINFOX=" OCTAVE_LOCALVERARCHLIBDIR "\n" -#else -# define INFOX "\nINFOX=" OCTAVE_LOCALARCHLIBDIR "\n" -#endif - -const char *infom = INFOM; -const char *infoo = INFOO; -const char *infox = INFOX; -const char *infoh = INFOH; -const char *infov = INFOV; -EOF - - dnl Compile program perhaps with a special version of mkoctfile - $MKOCTFILE conftest.cc || AC_MSG_ERROR(Could not run $MKOCTFILE) - - dnl Strip the config info from the compiled file - eval `strings conftest.o | grep "^INFO.=" | sed -e "s,//.*$,,"` - rm -rf conftest* - - dnl set the appropriate variables if they are not already set - ver=`echo $INFOV | sed -e "s/\.//" -e "s/\..*$//"` - subver=`echo $INFOV | sed -e "[s/^[^.]*[.][^.]*[.]//]"` - alt_mbase=`echo $INFOM | sed -e "[s,\/[^\/]*$,,]"` - alt_obase=`echo $INFOO | sed -e "[s,/site.*$,/site,]"` - test -z "$mpath" && mpath=$INFOM/octave-forge - test -z "$opath" && opath=$INFOO/octave-forge - test -z "$xpath" && xpath=$INFOX - test -z "$altmpath" && altmpath=$alt_mbase/octave-forge-alternatives/m - test -z "$altopath" && altopath=$alt_obase/octave-forge-alternatives/oct/$INFOH -fi - -dnl ******************************************************************* - -dnl XXX FIXME XXX Should we allow the user to override these? -dnl Do we even need them? The individual makefiles can call mkoctfile -p -dnl themselves, so the only reason to keep them is for configure, and -dnl for those things which are not built using mkoctfile (e.g., aurecord) -dnl but it is not clear we should be using octave compile flags for those. 
- -dnl C compiler and flags -AC_MSG_RESULT([retrieving compile and link flags from $MKOCTFILE]) -CC=`$MKOCTFILE -p CC` -CFLAGS=`$MKOCTFILE -p CFLAGS` -CPPFLAGS=`$MKOCTFILE -p CPPFLAGS` -CPICFLAG=`$MKOCTFILE -p CPICFLAG` -LDFLAGS=`$MKOCTFILE -p LDFLAGS` -LIBS=`$MKOCTFILE -p LIBS` -AC_SUBST(CC) -AC_SUBST(CFLAGS) -AC_SUBST(CPPFLAGS) -AC_SUBST(CPICFLAG) - -dnl Fortran compiler and flags -F77=`$MKOCTFILE -p F77` -FFLAGS=`$MKOCTFILE -p FFLAGS` -FPICFLAG=`$MKOCTFILE -p FPICFLAG` -AC_SUBST(F77) -AC_SUBST(FFLAGS) -AC_SUBST(FPICFLAG) - -dnl C++ compiler and flags -CXX=`$MKOCTFILE -p CXX` -CXXFLAGS=`$MKOCTFILE -p CXXFLAGS` -CXXPICFLAG=`$MKOCTFILE -p CXXPICFLAG` -AC_SUBST(CXX) -AC_SUBST(CXXFLAGS) -AC_SUBST(CXXPICFLAG) - -dnl ******************************************************************* - -dnl Check for features of your version of mkoctfile. -dnl All checks should be designed so that the default -dnl action if the tests are not performed is to do whatever -dnl is appropriate for the most recent version of Octave. 
- -dnl Define the following macro: -dnl OF_CHECK_LIB(lib,fn,true,false,helpers) -dnl This is just like AC_CHECK_LIB, but it doesn't update LIBS -AC_DEFUN(OF_CHECK_LIB, -[save_LIBS="$LIBS" -AC_CHECK_LIB($1,$2,$3,$4,$5) -LIBS="$save_LIBS" -]) - -dnl Define the following macro: -dnl TRY_MKOCTFILE(msg,program,action_if_true,action_if_false) -dnl -AC_DEFUN(TRY_MKOCTFILE, -[AC_MSG_CHECKING($1) -cat > conftest.cc << EOF -#include <octave/config.h> -$2 -EOF -ac_try="$MKOCTFILE -c conftest.cc" -if AC_TRY_EVAL(ac_try) ; then - AC_MSG_RESULT(yes) - $3 -else - AC_MSG_RESULT(no) - $4 -fi -]) - -dnl -dnl Check if F77_FUNC works with MKOCTFILE -dnl -TRY_MKOCTFILE([for F77_FUNC], -[int F77_FUNC (hello, HELLO) (const int &n);],, -[MKOCTFILE="$MKOCTFILE -DF77_FUNC=F77_FCN"]) - -dnl ********************************************************** - -dnl Evaluate an expression in octave -dnl -dnl OCTAVE_EVAL(expr,var) -> var=expr -dnl -AC_DEFUN(OCTAVE_EVAL, -[AC_MSG_CHECKING([for $1 in Octave]) -$2=`echo "disp($1)" | $OCTAVE -qf` -AC_MSG_RESULT($$2) -AC_SUBST($2) -]) - -dnl Check status of an octave variable -dnl -dnl OCTAVE_CHECK_EXIST(variable,action_if_true,action_if_false) -dnl -AC_DEFUN(OCTAVE_CHECK_EXIST, -[AC_MSG_CHECKING([for $1 in Octave]) -if test `echo 'disp(exist("$1"))' | $OCTAVE -qf`X != 0X ; then - AC_MSG_RESULT(yes) - $2 -else - AC_MSG_RESULT(no) - $3 -fi -]) - -dnl should check that $(OCTAVE) --version matches $(MKOCTFILE) --version -AC_CHECK_PROG(OCTAVE,octave,octave) -OCTAVE_EVAL(OCTAVE_VERSION,OCTAVE_VERSION) - -dnl grab canonical host type so we can write system specific install stuff -OCTAVE_EVAL(octave_config_info('canonical_host_type'),canonical_host_type) - -dnl grab SHLEXT from octave config -OCTAVE_EVAL(octave_config_info('SHLEXT'),SHLEXT) - -AC_PROG_LN_S -AC_PROG_RANLIB - -dnl Use $(COPY_FLAGS) to set options for cp when installing .oct files. 
-COPY_FLAGS="-Rfp" -case "$canonical_host_type" in - *-*-linux*) - COPY_FLAGS="-fdp" - ;; -esac -AC_SUBST(COPY_FLAGS) - -dnl Use $(STRIP) in the makefile to strip executables. If not found, -dnl STRIP expands to ':', which in the makefile does nothing. -dnl Don't need this for .oct files since mkoctfile handles them directly -STRIP=${STRIP-strip} -AC_CHECK_PROG(STRIP,$STRIP,$STRIP,:) - -dnl Strip on windows, don't strip on Mac OS/X or IRIX -dnl For the rest, you can force strip using MKOCTFILE="mkoctfile -s" -dnl or avoid strip using STRIP=: before ./configure -case "$canonical_host_type" in - powerpc-apple-darwin*|*-sgi-*) - STRIP=: - ;; - *-cygwin-*|*-mingw-*) - MKOCTFILE="$MKOCTFILE -s" - ;; -esac - - -AC_DEFINE(have_oss) -AC_CHECK_HEADER(linux/soundcard.h, have_oss=yes, have_oss=no) -if test $have_oss = yes ; then - OSS_STATUS="yes" - AC_SUBST(DEFHAVE_LINUX_SOUNDCARD) - DEFHAVE_LINUX_SOUNDCARD="HAVE_LINUX_SOUNDCARD=1" -else - OSS_STATUS="linux/soundcard.h not found" -fi - -CONFIGURE_OUTPUTS="Makeconf" -STATUS_MSG=" -octave commands will install into the following directories: - m-files: $mpath - oct-files: $opath - binaries: $xpath -alternatives: - m-files: $altmpath - oct-files: $altopath - -shell commands will install into the following directories: - binaries: $bindir - man pages: $mandir - libraries: $libdir - headers: $includedir - -octave-forge is configured with - octave: $OCTAVE (version $OCTAVE_VERSION) - mkoctfile: $MKOCTFILE for Octave $subver - audio capture: $OSS_STATUS"
--- a/main/audio/src/endpoint.cc Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,441 +0,0 @@ -// Author: Bruce T. Lowerre -// This program is granted to the public domain. - -/* - * ENDPOINT.CC - The endpoint class member routines. - * - * The endpointer is used to determine the start and end of a live - * input signal. Unlike a pre-recorded utterance, a live input signal - * is open-ended in that the actual start and end of the signal is - * totally unknown. The search will usually do a fairly good job of - * guessing the start of the signal. However, the actual end of the - * signal is unknown to the recognizer. Reaching the end state in the - * recognizer does not necessarily mean the end of signal. Therefore, - * the end of signal must be calculated by some means. This is the - * job of the end point detector. This module is accessed via a class - * structure. It should be called for each frame of data to determine - * what processing should be done. - * - * The endpointer uses "cheap" signal processing features (energy and - * zero cross count) and is intended to run constantly on a host - * processor without the need of a DSP or high speed processor. When - * the start of the utterance is detected, then the expensive search - * can be called. - * - * The endpointer is designed to run with a real-time processing - * search. That means that the live input signal is processed in - * real-time while it's being read. Therefore, the start of signal - * will occur (and the search will start) before the entire utterance - * has been read. The ramifications of this is that the endpointer - * has to guess as to the possible start and end of utterance. These - * guesses, frame labels, are used by other modules to guide the - * utterance capture and search. The endpointer may realize that it - * has mis-labeled either the start of utterance or the end of - * utterance. 
When this happens, a special frame label (either - * EP_RESET if a false start was detected or EP_NOTEND if a false end - * was detected) is returned. - * - * The algorithms used in this module have evolved from 20 years of - * work with live input signals. */ - - -#include <iostream> -#include <cmath> -#include "endpoint.h" -using namespace std; - - -/* ENDPOINTER::ENDPOINTER - class constructor, set initial values */ -endpointer::endpointer -( - long d_samprate, // sampling rate in Hz - long d_windowsize, // windowsize in samples - long d_stepsize, // step size in samples - long d_maxipause, // default ending silence in msec - long d_minuttlng, // default minuttlng in msec - long d_zcthresh, // default zcthresh, Hz - float d_begfact, // default begfact - float d_endfact, // default endfact - float d_energyfact, // default energyfact - float d_minstartsilence, // default minstartsilence - float d_triggerfact, // default triggerfact - long d_numdpnoise, // default numdpnoise - long d_minfriclng, // default minfriclng in msec - long d_maxpause, // default maxpause in msec - long d_startblip, // default startblip in msec - long d_endblip, // default endblip in msec - long d_minvoicelng, // default minvoicelng in msec - long d_minrise // default minrise in msec -) -{ - long i; - - samprate = d_samprate; - windowsize = d_windowsize; - stepsize = d_stepsize; - maxipause = (d_maxipause * samprate) / (1000 * stepsize); // num steps - minuttlng = (d_minuttlng * samprate) / (1000 * stepsize); // num steps - zcthresh = (d_zcthresh * stepsize) / samprate; // per frame - begfact = d_begfact; - endfact = d_endfact; - energyfact = d_energyfact; - minstartsilence = d_minstartsilence; - numdpnoise = d_numdpnoise; - triggerfact = d_triggerfact; - minfriclng = (d_minfriclng * samprate) / (1000 * stepsize); // num steps - maxpause = (d_maxpause * samprate) / (1000 * stepsize); // num steps - startblip = (d_startblip * samprate) / (1000 * stepsize); // num steps - endblip = (d_endblip 
* samprate) / (1000 * stepsize); // num steps - minvoicelng = (d_minvoicelng * samprate) / (1000 * stepsize); // num steps - minrise = (d_minrise * samprate) / (1000 * stepsize); // num steps - lastdpnoise = new float[numdpnoise]; - for (i = 0; i < numdpnoise; i++) - lastdpnoise[i] = 0.0; - initendpoint (); -} // end endpointer::endpointer - - -/* ENDPOINTER::~ENDPOINTER - class destructor */ -endpointer::~endpointer () -{ - delete []lastdpnoise; -} // end endpointer::~endpointer - - -/* ENDPOINT::INITENDPOINT - initialize the endpoint variables */ -void endpointer::initendpoint () -{ - long i; - - epstate = NOSILENCE; - noise = 0.0; - ave = 0.0; - begthresh = 0.0; - endthresh = begthresh; - energy = 0.0; - maxpeak = 0.0; - scnt = 0; - vcnt = 0; - evcnt = 0; - voicecount = 0; - zccnt = 0; - bscnt = 0; - startframe = 0; - endframe = 0; - avescnt = 0; - startsilenceok = false; - ncount = 0; - low = true; - for (i = 0; i < numdpnoise; i++) - lastdpnoise[i] = 0.0; -} // end endpointer::initendpoint - - -void endpointer::setnoise () -{ - dpnoise = lastdpnoise[1] = lastdpnoise[0]; - ncount = 2; -} // end endpointer::setnoise - - -/* ENDPOINT::AVERAGENOISE - get average background noise level and - * shift noise array */ -void endpointer::averagenoise () -{ - long i; - - for (dpnoise = 0.0, i = ncount - 1; i > 0; i--) - { - dpnoise += lastdpnoise[i]; - lastdpnoise[i] = lastdpnoise[i - 1]; - } - dpnoise = (dpnoise + lastdpnoise[0]) / ncount; - if (ncount < numdpnoise) - ncount ++; -} // end endpointer::averagenoise - - -/* ENDPOINT::ZCPEAKPICK - get the zero cross count and average energy */ -void endpointer::zcpeakpick -( - short *samples // raw samples -) -{ - long i; - float sum, - trigger; - short *smp; - - for (sum = 0.0, i = 0, smp = samples; i < windowsize; i++, smp++) - sum += *smp * *smp; - peakreturn = (sqrt (sum / windowsize)); - lastdpnoise[0] = peakreturn; - - if (ncount == 0) - dpnoise = peakreturn; // initial value - trigger = dpnoise * triggerfact; // 
schmidt trigger band - - for (i = 0, zc = 0, smp = samples; i < windowsize; i++, smp++) - { - if (low) - { - if (*smp > trigger) - { // up cross - zc++; - low = false; // search for down cross - } - } - else - { - if (*smp < -trigger) - { // down cross - zc++; - low = true; // search for up cross - } - } - } -} // end endpointer::zcpeakpick - - -/* ENDPOINT::GETENDPOINT - get the endpoint tag for the raw samples - * The recognition system is designed to operate in real-time. That - * is, the search proceeds in parallel with input of the signal. The - * endpoint detection must, therefore, make a guess as to what the - * current sample is and correct errors that may have been made - * previously. */ -EPTAG endpointer::getendpoint -( - short *samples // raw samples -) -{ - float tmp; - - zcpeakpick (samples); // get zc count and peak energy - if (peakreturn > maxpeak) - { - maxpeak = peakreturn; - if ((tmp = maxpeak / endfact) > endthresh) - endthresh = tmp; - } - - switch (epstate) - { - case NOSILENCE: // start, get background silence - ave += peakreturn; - if (++scnt <= 3) - { // average 3 frame's worth - if (scnt == 1) - setnoise (); - else - averagenoise (); - if (dpnoise < minstartsilence) - { - startsilenceok = true; - ave += peakreturn; - avescnt++; - } - return (EP_SILENCE); - } - if (!startsilenceok) - { - epstate = START; - return (EP_NOSTARTSILENCE); - } - ave /= avescnt; - noise = ave; - begthresh = noise + begfact; - endthresh = begthresh; - mnbe = noise * energyfact; - epstate = INSILENCE; - return (EP_SILENCE); - - case INSILENCE: - ave = ((3.0 * ave) + peakreturn) / 4.0; - if (peakreturn > begthresh || zc > zcthresh) - { // looks like start of signal - energy += peakreturn - noise; - if (zc > zcthresh) - zccnt++; - if (peakreturn > begthresh) - voicecount++; - if (++vcnt > minrise) - { - scnt = 0; - epstate = START; // definitely start of signal - } - return (EP_SIGNAL); - } - else - { // still in silence - energy = 0.0; - if (ave < noise) - { - noise 
= ave; - begthresh = noise + begfact; - endthresh = begthresh; - mnbe = noise * energyfact; - } - if (vcnt > 0) - { // previous frame was signal - if (++bscnt > startblip || zccnt == vcnt) - { // Oops, no longer in the signal - noise = ave; - begthresh = noise * begfact; - endthresh = begthresh; - mnbe = noise * energyfact; - vcnt = 0; - zccnt = 0; - bscnt = 0; - voicecount = 0; - startframe = 0; - return (EP_RESET);// not in the signal, ignore previous - } - return (EP_SIGNAL); - } - zccnt = 0; - return (EP_SILENCE); - } - - case START: - if (peakreturn > begthresh || zc > zcthresh) - { // possible start of signal - energy += peakreturn - noise; - if (zc > zcthresh) - zccnt++; - if (peakreturn > begthresh) - voicecount++; - vcnt += scnt + 1; - scnt = 0; - if (energy > mnbe || zccnt > minfriclng) - { - epstate = INSIGNAL; - return (EP_INUTT); - } - else - return (EP_SIGNAL); - } - else - if (++scnt > maxpause) - { // signal went low again, false start - vcnt = zccnt = voicecount = 0; - energy = 0.0; - epstate = INSILENCE; - ave = ((3.0 * ave) + peakreturn) / 4.0; - if (ave < noise + begfact) - { // lower noise level - noise = ave; - begthresh = noise + begfact; - endthresh = begthresh; - mnbe = noise * energyfact; - } - return (EP_RESET); - } - else - return (EP_SIGNAL); - - case INSIGNAL: - if (peakreturn > endthresh || zc > zcthresh) - { // still in signal - if (peakreturn > endthresh) - voicecount++; - vcnt++; - scnt = 0; - return (EP_SIGNAL); - } - else - { // below end threshold, may be end - scnt++; - epstate = END; - return (EP_MAYBEEND); - } - - case END: - if (peakreturn > endthresh || zc > zcthresh) - { // signal went up again, may not be end - if (peakreturn > endthresh) - voicecount++; - if (++evcnt > endblip) - { // back in signal again - vcnt += scnt + 1; - evcnt = 0; - scnt = 0; - epstate = INSIGNAL; - return (EP_NOTEND); - } - else - return (EP_SIGNAL); - } - else - if (++scnt > maxipause) - { // silence exceeds inter-word pause - if (vcnt > 
minuttlng && voicecount > minvoicelng) - return (EP_ENDOFUTT);// end of utterance - else - { // signal is too short - scnt = vcnt = voicecount = 0; - epstate = INSILENCE; - return (EP_RESET); // false utterance, keep looking - } - } - else - { // may be an inter-word pause - if (peakreturn == 0) - return (EP_ENDOFUTT);// zero filler frame - evcnt = 0; - return (EP_SIGNAL); // assume still in signal - } - } -} // end endpointer::getendpoint - - -/* ENDPOINT::PRINTVARS: Print variable values */ -void endpointer::printvars () -{ - cout << "endpoint variables:" << endl; - cout << " begfact " << begfact << endl; - cout << " endblip " << endblip << endl; - cout << " endfact " << endfact << endl; - cout << " energyfact " << energyfact << endl; - cout << " maxipause " << maxipause << endl; - cout << " maxpause " << maxpause << endl; - cout << " minfriclng " << minfriclng << endl; - cout << " minrise " << minrise << endl; - cout << " minstartsilence " << minstartsilence << endl; - cout << " minuttlng " << minuttlng << endl; - cout << " minvoicelng " << minvoicelng << endl; - cout << " numdpnoise " << numdpnoise << endl; - cout << " samprate " << samprate << endl; - cout << " startblip " << startblip << endl; - cout << " stepsize " << stepsize << endl; - cout << " triggerfact " << triggerfact << endl; - cout << " windowsize " << windowsize << endl; - cout << " zcthresh " << zcthresh << endl; -} // end endpointer::printvars - - -/* ENDPOINT::GETTAGNAME - convert the tag to ascii */ -const char *endpointer::gettagname -( - EPTAG tag -) -{ - static const char *tagnames[] = // must match EPTAG enum in endpoint.h - { - "NONE", - "RESET", - "SILENCE", - "SIGNAL", - "INUTT", - "MAYBEEND", - "ENDOFUTT", - "NOTEND", - "NOSTARTSILENCE" - }; - long ntag = long (tag); - - if (ntag < 0 || ntag > long (EP_NOSTARTSILENCE)) - return ("UNKNOWN"); - else - return (tagnames[ntag]); -} // end endpointer::gettagname -
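Note on the removed zcpeakpick(): it derives two per-frame features, the RMS energy of the window and a zero-cross count that only registers a crossing once the signal has passed +trigger or -trigger, so small noise around zero is ignored. Below is a minimal standalone sketch of that computation. The names FrameFeatures and frame_features are hypothetical, and the trigger level is passed in directly rather than computed as dpnoise * triggerfact as the class does.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical result type: the two features the endpointer uses per frame.
struct FrameFeatures {
    float rms;   // root-mean-square energy of the window
    long  zc;    // zero-cross count with hysteresis
};

// Hypothetical free-function version of the zcpeakpick() logic.
// `low` carries the trigger state from one frame to the next, as the
// class member of the same name does.
FrameFeatures frame_features(const short* samples, std::size_t n,
                             float trigger, bool& low)
{
    FrameFeatures f = {0.0f, 0};
    if (n == 0)
        return f;

    // RMS energy: sqrt of the mean squared sample value.
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; i++)
        sum += float(samples[i]) * float(samples[i]);
    f.rms = std::sqrt(sum / float(n));

    // Count crossings only when the signal clears the trigger band.
    for (std::size_t i = 0; i < n; i++) {
        if (low) {
            if (samples[i] > trigger) {     // up cross
                f.zc++;
                low = false;                // now look for a down cross
            }
        } else {
            if (samples[i] < -trigger) {    // down cross
                f.zc++;
                low = true;                 // now look for an up cross
            }
        }
    }
    return f;
}
```

Raising the trigger above the signal amplitude drives the count to zero, which is the point of the hysteresis band.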
--- a/main/audio/src/endpoint.h Sat Jul 26 20:53:49 2014 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,135 +0,0 @@ -// Author: Bruce T. Lowerre -// This program is granted to the public domain. - -/* - * ENDPOINT.H - endpoint class definition - * - * The endpointer is used to determine the start and end of a live - * input signal. Unlike a pre-recorded utterance, a live input signal - * is open-ended in that the actual start and end of the signal is - * totally unknown. The search, using HMM techniques with a silence - * model, will usually do a fairly good job of guessing the start of - * the signal. However, the actual end of the signal is unknown to - * the recognizer. Reaching the end state in the recognizer does not - * necessarily mean the end of signal. Therefore, the end of signal - * must be calculated by some means. This is the job of the end point - * detector. */ - -#ifndef ENDPOINT_H -#define ENDPOINT_H - -//#include <general.h> // contains general defs - -typedef enum -{ - NOSILENCE, - INSILENCE, - START, - INSIGNAL, - END -} EPSTATE; - -typedef enum -{ - EP_NONE, - EP_RESET, - EP_SILENCE, - EP_SIGNAL, - EP_INUTT, - EP_MAYBEEND, - EP_ENDOFUTT, - EP_NOTEND, - EP_NOSTARTSILENCE -} EPTAG; - -class endpointer -{ - private: - EPSTATE epstate; - float ave, - noise, - begthresh, - energy, - maxpeak, - endthresh, - begfact, - endfact, - energyfact, - mnbe, - peakreturn, // average energy - dpnoise, - triggerfact, // schmidt trigger percent - minstartsilence, - *lastdpnoise; // array of size numdpnoise - long samprate, // sampling rate in Hz - windowsize, // window size in samples - stepsize, // step size in samples - scnt, - avescnt, - vcnt, - evcnt, - voicecount, - minfriclng, - bscnt, - zccnt, - startframe, - endframe, - ncount, - zcthresh, - numdpnoise, - minrise, - maxpause, - maxipause, - startblip, - endblip, - minuttlng, - minvoicelng, - zc; // zero cross count per window - bool startsilenceok, - low; // is signal currently low or high? 
- void zcpeakpick // get zc count and average energy - ( - short* // raw samples - ); - void setnoise (); // initial noise level set - void averagenoise (); // average noise array and shift - public: - endpointer // constructor - ( - long, // sampling rate in Hz - long, // window size in samples - long, // step size in samples - long = 700, // endof utt silence default, msec - long = 100, // minuttlng default, msec - long = 600, // zcthresh default, Hz - float = 40.0, // begfact default - float = 80.0, // endfact default - float = 200.0, // energyfact default - float = 2000.0, // minstartsilence default - float = 3.0, // triggerfact default - long = 6, // numdpnoise default - long = 50, // minfriclng default, msec - long = 150, // maxpause default, msec - long = 30, // startblip default, msec - long = 20, // endblip default, msec - long = 60, // minvoicelng default, msec - long = 50 // minrise default, msec - ); - ~endpointer (); // destructor - - void initendpoint (); // initialize variables - EPTAG getendpoint - ( - short* // raw samples of window size - ); - const char *gettagname // convert tag to ascii - ( - EPTAG - ); - void printvars (); // print variables - long getzc () {return (zc);} // get the zero cross count - float getenergy () {return (peakreturn);} // get the RMS energy -}; // end class endpointer - - -#endif
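Note on the removed constructor defaults: the durations declared in endpoint.h are given in msec, and the constructor in endpoint.cc converts each one to a count of analysis steps with (msec * samprate) / (1000 * stepsize); zcthresh is converted from Hz to crossings per frame with (hz * stepsize) / samprate. Below is a minimal sketch of those two conversions. The function names ms_to_steps and hz_to_per_frame are hypothetical, and the 16000 Hz rate with a 160-sample step in the usage note is an assumed example (a 10 msec step at that rate).

```cpp
// Hypothetical helper: duration in msec -> number of analysis steps,
// using the same integer arithmetic as the endpointer constructor.
long ms_to_steps(long msec, long samprate, long stepsize)
{
    return (msec * samprate) / (1000 * stepsize);
}

// Hypothetical helper: zero-cross threshold in Hz -> crossings per frame.
long hz_to_per_frame(long hz, long samprate, long stepsize)
{
    return (hz * stepsize) / samprate;
}
```

Under those assumed values, the 700 msec end-of-utterance silence becomes 70 steps, the 50 msec minrise becomes 5 steps, and the 600 Hz zcthresh becomes 6 crossings per frame. Integer division truncates, so short durations at low sample rates can round down.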