voiceglue_0.12_user_guide 2010/05/14 05:35 current
 +====== Verifying Voiceglue Installation ======
 +
 +After a successful installation, three new services will
 +be available to run:
 +
 +  * dynlog
 +  * phoneglue
 +  * voiceglue
 +
 +They cannot be started successfully, however, until Asterisk and
 +Voiceglue are properly configured.
 +
 +====== Configuring Asterisk ======
 +
 +The phoneglue service needs to log in to the Asterisk manager with
 +username "phoneglue" and password "phoneglue" (configurable with
 +command-line arguments), so configure Asterisk's manager.conf with an
 +entry like this for Asterisk 1.4:
 +
 +      [phoneglue]
 +      secret=phoneglue
 +      read = system,call,log,verbose,command,agent,user
 +      write = system,call,log,verbose,command,agent,user
 +
 +and like this for Asterisk 1.6:
 +
 +      [phoneglue]
 +      secret=phoneglue
 +      read = system,call,log,verbose,command,agent,user,originate
 +      write = system,call,log,verbose,command,agent,user,originate
 +
 +Also make sure you have:
 +
 +      enabled=yes
 +
 +in the same file.
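
One way to check the manager.conf entry is to send the same login that phoneglue would send. The sketch below is only an illustration, not part of voiceglue: it builds a standard Asterisk Manager Interface ''Login'' action and, optionally, sends it. The host ''localhost'' and port ''5038'' (the usual manager port) are assumptions; adjust them to match your manager.conf.

```python
import socket


def build_ami_login(username: str = "phoneglue", secret: str = "phoneglue") -> bytes:
    """Build an Asterisk Manager Interface Login action as raw bytes."""
    lines = [
        "Action: Login",
        f"Username: {username}",
        f"Secret: {secret}",
    ]
    # AMI actions are CRLF-separated header lines terminated by a blank line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def try_login(host: str = "localhost", port: int = 5038, timeout: float = 3.0) -> str:
    """Send the login and return Asterisk's raw reply (needs a running Asterisk).

    host and port are assumptions for this sketch, not values from the guide.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.recv(1024)  # greeting banner sent by the manager interface
        sock.sendall(build_ami_login())
        return sock.recv(4096).decode("ascii", errors="replace")
```

If the entry is correct, the reply from ''try_login()'' should report a successful authentication; a rejected login usually points at the secret or at ''enabled=yes'' being missing.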
 +
 +===== Passing Calls and Values to Voiceglue =====
 +
 +The asterisk dialplan must be used to route calls to voiceglue.
 +The simplest way to do this would be a dialplan looking something like:
 +
 +      [phoneglue]
 +      exten => 1,1,Answer
 +      exten => 1,2,Agi(agi://localhost)
 +      exten => 1,3,Hangup
 +
 +Here, whenever a call is routed to context ''phoneglue'' extension ''1'',
 +it is first answered, then routed to voiceglue (the Agi command),
 +then hung up.
 +
 +Parameters can be passed to the VXML script like this:
 +
 +      exten => 1,1,Answer
 +      exten => 1,2,Set(vxmlarg=bar)
 +      exten => 1,3,Agi(agi://localhost/foo=${vxmlarg})
 +      exten => 1,4,Hangup
 +
 +Here, the value ''bar'' will be available in the VXML script as
 +the value of session-scoped variable session.connection.initargs.foo
 +and ''foo=bar'' will be passed as a URL-encoded argument to the
 +initial VXML script fetch.
 +As a larger example:
 +
 +      exten => 1,1,Answer
 +      exten => 1,2,Set(vxmlurl=http%3A%2F%2Fvgweb-laptop%2Fvxml%2Furlin.vxml)
 +      exten => 1,3,Set(vxmlarg=foo)
 +      exten => 1,4,Set(virthost=vgweb-laptop)
 +      exten => 1,5,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +      exten => 1,6,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +      exten => 1,7,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +      exten => 1,8,Hangup
 +
 +Here, 3 parameters are passed to the VXML script, ''url'', ''arg'',
 +and ''virthost'', and these are available in the script as
 +session.connection.initargs.url, session.connection.initargs.arg,
 +and session.connection.initargs.virthost.
 +The ''url'' parameter is special, as it is used as the URL of
 +the initial VXML page to fetch and run for the call.
 +If this is not specified here, voiceglue uses the definitions
 +in the /etc/voiceglue.conf file to determine the initial script.
 +Notice the percent-encoding of the url argument so that it
 +doesn't confuse the parameter parsing.  The URL
 +represented in the example is actually ''http://vgweb-laptop/vxml/urlin.vxml''.
 +For details on percent-encoding, see a reference such as
 +http://en.wikipedia.org/wiki/Percent_encoding
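
If the dialplan values are generated by a tool rather than typed by hand, the percent-encoding can be produced programmatically. The following is a minimal sketch using Python's standard library; the helper names ''build_agi_url'' and ''parse_agi_args'' are made up for this illustration, and the parsing shown is not voiceglue's own code, only a picture of how the name/value pairs separate cleanly once the values are encoded.

```python
from urllib.parse import quote, unquote


def build_agi_url(host: str, params: dict) -> str:
    """Join name=value pairs with '&', percent-encoding every value."""
    pairs = [f"{name}={quote(value, safe='')}" for name, value in params.items()]
    return f"agi://{host}/" + "&".join(pairs)


def parse_agi_args(agi_url: str) -> dict:
    """Split the text after 'agi://host/' back into decoded name/value pairs."""
    arg_text = agi_url.split("/", 3)[3]
    result = {}
    for pair in arg_text.split("&"):
        name, _, value = pair.partition("=")
        result[name] = unquote(value)
    return result


agi_url = build_agi_url(
    "vgweb-laptop",
    {
        "url": "http://vgweb-laptop/vxml/urlin.vxml",
        "arg": "foo",
        "virthost": "vgweb-laptop",
    },
)
print(agi_url)
```

The printed value has the same shape as the Agi() argument in the larger example, with the ''url'' value encoded as ''http%3A%2F%2Fvgweb-laptop%2Fvxml%2Furlin.vxml''.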
 +
 +Also notice that if you pass another URL as a parameter value, its percent signs must themselves be percent-encoded (''%'' becomes ''%25''), so the inner URL ends up encoded twice.
 +E.g.:
 +
 +        Set(vxmlurl=http%3A%2F%2Falt.com%2Fvxml%2Fdoit.vxml)
 +        Set(vxmlarg=http%253A%252F%252Fanother.site%252Fdot%252Fcom.vxml)
 +        Agi(agi://localhost/url=${vxmlurl}%26arg=${vxmlarg})
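
The double encoding used for ''vxmlarg'' above is simply percent-encoding applied twice. A small sketch, with hypothetical helper names, that reproduces both values from the example:

```python
from urllib.parse import quote


def encode_once(value: str) -> str:
    """Percent-encode every reserved character, including ':' and '/'."""
    return quote(value, safe="")


def encode_nested_url(value: str) -> str:
    """Encode twice, so each '%' from the first pass becomes '%25'."""
    return encode_once(encode_once(value))


vxmlurl = encode_once("http://alt.com/vxml/doit.vxml")
vxmlarg = encode_nested_url("http://another.site/dot/com.vxml")
print(vxmlurl)
print(vxmlarg)
```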
 +
 +If you are getting the url from an external source (SIP for instance), then you can encode it like this:
 +
 +        Set(uri_encoded=${URIENCODE(${BASE64_ENCODE(${external_uri})})})
 +        Agi(agi://localhost/url=${vxmlurl}%26uri=${uri_encoded})
 +
 +Remember to (base64) decode it in your script!
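
To make the round trip concrete, here is a sketch of the same two steps outside the dialplan: base64-encode, then percent-encode, and the reverse on the receiving side. The function names and the sample SIP URI are invented for this example. Where exactly the decoding happens (in the VXML script or in a server-side handler) depends on your application.

```python
import base64
from urllib.parse import quote, unquote


def encode_external_uri(uri: str) -> str:
    """Mirror URIENCODE(BASE64_ENCODE(uri)): base64 first, then percent-encode."""
    b64 = base64.b64encode(uri.encode("utf-8")).decode("ascii")
    return quote(b64, safe="")


def decode_external_uri(value: str) -> str:
    """Undo the encoding; unquote() is harmless if the value was already
    percent-decoded, because plain base64 text contains no '%'."""
    return base64.b64decode(unquote(value)).decode("utf-8")


original = "sip:alice@example.com;transport=udp?x=1&y=2"
encoded = encode_external_uri(original)
decoded = decode_external_uri(encoded)
```

Because base64 output contains only letters, digits, ''+'', ''/'' and ''='', and those three symbols are percent-encoded in the second step, the encoded value cannot be confused with the ''&'' and ''='' separators in the AGI argument list.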
 +
 +===== Retrieving Values From Voiceglue =====
 +
 +Notice this portion of the last example dialplan above:
 +
 +      exten => 1,5,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +      exten => 1,6,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +      exten => 1,7,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
 +
 +This shows three consecutive calls to voiceglue.
 +If the VXML script does not hang up the call with a <disconnect> tag,
 +the dialplan continues on upon termination of the script.
 +Additionally, if a namelist is passed to the <exit> tag in
 +the VXML script, then
 +those variable names and values will be set as channel variables
 +in the asterisk dialplan.
 +Because of asterisk limitations on AGI syntax, these values should
 +be scalars with no special characters.
 +
 +====== Configuring Voiceglue ======
 +
 +Even if you use the ''url'' parameter in the AGI command in your Asterisk
 +dialplan, the ''/etc/voiceglue.conf'' file must be present and valid.  The
 +file ''/etc/voiceglue.conf'' contains the all-important definition of
 +ast_sound_dir (don't remove this!) and additional lines that each
 +contain a whitespace-separated DNIS
 +(incoming number) and URL pair.  Such a pair maps the
 +incoming phone number to the URL to load.  As mentioned above, when the
 +''url'' parameter is passed to the AGI command from Asterisk, this mapping is ignored.
 +
 +The ''/etc/voiceglue.conf'' file can be changed dynamically and
 +voiceglue will immediately notice
 +any changes.  The wildcard DNIS ''*'' can be used to match any number that
 +isn't matched otherwise.
 +
 +So, an example /etc/voiceglue.conf could contain:
 +
 +<code bash>
 +        * http://localhost/vxml/welcome-audiofile.vxml
 +</code>
 +
 +This would result in all incoming calls being handled by the
 +welcome-audiofile.vxml script found at ''http://localhost/vxml/''.
 +
 +Additional parameters that affect the operation of voiceglue may
 +also be placed in the /etc/voiceglue.conf file.
 +The format of every parameter is:
 +
 +<code bash>
 +        parameter = value
 +</code>
 +
 +The whitespace on either side of the "=" is required.
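
The two kinds of lines can be told apart mechanically, which may help when generating or checking the file. The sketch below is not voiceglue's parser; it is a guess at the rules as described above (three or more fields with a standalone ''='' is a parameter, exactly two fields is a DNIS/URL pair, and ''*'' is the fallback). The ast_sound_dir value and the DNIS ''5551234'' in the sample are hypothetical.

```python
def parse_voiceglue_conf(text: str):
    """Return (params, routes) from voiceglue.conf-style text.

    Assumed rules for this sketch: 'name = value' lines are parameters,
    two-field lines are 'DNIS URL' pairs, anything else is skipped.
    """
    params, routes = {}, {}
    for raw_line in text.splitlines():
        fields = raw_line.split()
        if len(fields) >= 3 and fields[1] == "=":
            params[fields[0]] = " ".join(fields[2:])
        elif len(fields) == 2:
            routes[fields[0]] = fields[1]
    return params, routes


def url_for_dnis(routes: dict, dnis: str):
    """Exact DNIS match first, then the '*' wildcard, else None."""
    return routes.get(dnis, routes.get("*"))


sample = """\
ast_sound_dir = /var/lib/asterisk/sounds
audio_fetch_timeout = 10
5551234 http://localhost/vxml/menu-input.vxml
* http://localhost/vxml/welcome-audiofile.vxml
"""
params, routes = parse_voiceglue_conf(sample)
```

Note that a line such as ''audio_maxage=300'' (no whitespace around the ''='') falls into neither category in this sketch, which matches the requirement stated above.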
 +
 +All of the parameters below are optional -- the default value is
 +used if it is not specified in /etc/voiceglue.conf:
 +
 +^ Parameter ^ Default ^ Meaning ^
 +| blind_xfer_method | transfer | The Asterisk method used to implement the VXML transfer tag, choices are "transfer" or "dial" |
 +| audio_fetch_retry | 60 | The number of seconds after an audio fetch fails to wait before retrying |
 +| audio_fetch_timeout | 7 | The default number of seconds to wait for audio to be retrieved from a source before timing out |
 +| audio_maxage | 300 | The default number of seconds that an audio cache entry remains valid |
 +| cache_purge_interval | 420 | The number of seconds between audio cache purges |
 +| cache_lastused_purge | 240 | The number of seconds of non-use that will cause an audio cache item to be purged |
 +| ssml_passthrough | 0 | If set to 1, all SSML markup is passed to the TTS generator |
 +
 +====== Extra Logging ======
 +
 +By default, the dynlog program collects all logs from the phoneglue
 +and voiceglue processes.  It is not strictly required, but without it
 +you will be scouring multiple log files to find out what's happening.
 +The logs are written to /var/log/dynlog/dynlog.  Dynlog has a dynamic
 +log-level changing capability, so by running "dynlog_level 7" you will
 +get the full output from all voiceglue components.  Running
 +"dynlog_level 4" will get you back to a more sane level.  These levels
 +are identical to those used by syslog.  I recommend setting the level
 +to 7 (the highest) when you are trying to debug a problem with
 +voiceglue.  Note that you don't have to stop or re-start anything;
 +dynlog and its clients coordinate dynamically to achieve the
 +appropriate level of logging.  Note, also, that raising the log level
 +on a loaded system can noticeably degrade performance.
 +
 +The VXML <log> markups appear in dynlog as well, and are assigned
 +log level 5.
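
Since the levels are the syslog ones, the numbers map onto the familiar syslog severity names. The sketch below lists them and shows the usual syslog-style threshold rule, where a message is kept when its level is numerically less than or equal to the current level. That dynlog filters in exactly this way is an assumption of this example, not something stated above.

```python
# Standard syslog severity numbers and names.
SYSLOG_LEVELS = {
    0: "emerg",
    1: "alert",
    2: "crit",
    3: "err",
    4: "warning",
    5: "notice",
    6: "info",
    7: "debug",
}


def is_logged(message_level: int, dynlog_level: int) -> bool:
    """Assumed threshold rule: keep messages at or below the current level."""
    return message_level <= dynlog_level


VXML_LOG_LEVEL = 5  # level assigned to VXML <log> output, per the text above
```

Under that assumption, VXML <log> output (level 5, "notice") would show up at "dynlog_level 7" but not at "dynlog_level 4".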
 +
 +====== Starting Voiceglue ======
 +
 +After performing the above configuration steps and making sure
 +asterisk is running, start the voiceglue services by rebooting or
 +running as root:
 +
 +        /etc/init.d/dynlog start
 +        /etc/init.d/phoneglue start
 +        /etc/init.d/voiceglue start
 +
 +These services must always be brought up in this order (and
 +after Asterisk is running), and be brought down in the reverse
 +order.
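
If you script the startup and shutdown, the ordering rule is easy to encode. A minimal sketch that only builds the command lines (it does not run them); the ''stop'' argument is assumed to follow the usual init script convention.

```python
SERVICES = ["dynlog", "phoneglue", "voiceglue"]  # required start order


def init_commands(action: str) -> list:
    """Return the init.d command lines in the correct order for 'start' or 'stop'."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    # Services are stopped in the reverse of the start order.
    ordered = SERVICES if action == "start" else SERVICES[::-1]
    return [f"/etc/init.d/{name} {action}" for name in ordered]


for command in init_commands("stop"):
    print(command)
```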
 +
 +Once everything is up and stays up, you should be able to call
 +in and have voiceglue interpret the VXML file(s) specified in ''/etc/voiceglue.conf''
 +or in the AGI arguments.
 +
 +
 +====== Audio file formats ======
 +
 +Voiceglue supports the following audio file formats:
 +
 +  * ulaw
 +  * alaw
 +  * slin (16-bit 8khz signed linear)
 +  * gsm
 +  * mp3
 +
 +Each of these is supported only insofar as the installed
 +Asterisk supports it.
 +
 +**WARNING**:  Some versions of Asterisk have a bug
 +in their implementation of mp3 support for the STREAM FILE
 +command.
 +Until this bug is fixed, voiceglue cannot play mp3s.
 +
 +Voiceglue must be able to determine the audio format of a file
 +before it can use it.
 +Voiceglue first checks the Content-Type returned
 +by the HTTP server that supplied the audio data.
 +The supported Content-Type fields are:
 +
 +^ Content-Type ^ Audio Format ^
 +| audio/basic | ulaw |
 +| audio/x-alaw-basic | alaw |
 +| audio/x-wav | slin |
 +| audio/x-gsm | gsm |
 +| audio/mpeg | mp3 |
 +
 +If the Content-Type field is not defined, or is returned
 +as text/plain (which is common if your web server is not
 +configured with the proper content type mapping for the
 +audio file extensions), then voiceglue attempts to determine
 +the audio file type by the filename extension.
 +The supported extensions are:
 +
 +^ Extension ^ Audio Format ^
 +| .ulaw | ulaw |
 +| .au | ulaw |
 +| .pcm | ulaw |
 +| .ul | ulaw |
 +| .mu | ulaw |
 +| .alaw | alaw |
 +| .al | alaw |
 +| .wav | slin |
 +| .gsm | gsm |
 +| .mp3 | mp3 |
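
The two tables above amount to a two-step lookup, which can be sketched as follows. This is an illustration of the rule as described, not voiceglue's code. The guide does not say what happens for a Content-Type that is neither listed nor text/plain; this sketch returns ''None'' in that case, which is an assumption.

```python
import os
from urllib.parse import urlparse

CONTENT_TYPES = {
    "audio/basic": "ulaw",
    "audio/x-alaw-basic": "alaw",
    "audio/x-wav": "slin",
    "audio/x-gsm": "gsm",
    "audio/mpeg": "mp3",
}

EXTENSIONS = {
    ".ulaw": "ulaw", ".au": "ulaw", ".pcm": "ulaw", ".ul": "ulaw", ".mu": "ulaw",
    ".alaw": "alaw", ".al": "alaw",
    ".wav": "slin",
    ".gsm": "gsm",
    ".mp3": "mp3",
}


def audio_format(url: str, content_type=None):
    """Pick the audio format from the Content-Type, else from the extension."""
    if content_type:
        mime = content_type.split(";")[0].strip().lower()
        if mime in CONTENT_TYPES:
            return CONTENT_TYPES[mime]
        if mime != "text/plain":
            return None  # assumption: unlisted types are treated as unknown
    # No Content-Type, or text/plain: fall back to the filename extension.
    extension = os.path.splitext(urlparse(url).path)[1]
    return EXTENSIONS.get(extension)
```

For example, ''audio_format("http://localhost/prompts/hello.al", "text/plain")'' yields ''alaw'', which is the misconfigured web server case mentioned above.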
 +
 +=====  Audio Streaming  =====
 +
 +The VXML specification does not require audio streaming.
 +It implies that audio fetches are finite, and requires
 +only that an implementation start playing audio after the
 +resource has been completely fetched.
 +The specification does permit an "optimization" whereby an implementation
 +can start playing audio before it has been completely fetched,
 +but voiceglue does not perform this optimization.
 +
 +====== Audio Caching ======
 +
 +Voiceglue employs a shared audio caching mechanism
 +that provides significant performance gains when
 +multiple calls use the same audio data.
 +
 +All audio data, whether downloaded from an HTTP
 +server or generated by TTS, is cached by default
 +in the filesystem that is shared between voiceglue
 +and asterisk.
 +This cached audio data is used for all calls until
 +it expires based on the HTTP headers returned from
 +the web server or lack of use by the application.
 +
 +Voiceglue never uses stale audio data.
 +Thus, the VXML maxstale attribute and audiomaxstale
 +property have no effect.
 +
 +If the script author desires to prevent caching,
 +the VXML maxage attribute or
 +the audiomaxage property can be set to 0.
 +This will force a non-shared and non-reusable
 +audio fetch for that instance.
 +
 +It is not possible to prevent caching by returning
 +an HTTP header value that disables caching, such
 +as Expires, Cache-Control:no-cache, or
 +Cache-Control:private.
 +While these values will prevent further use of the
 +audio data returned, they will not prevent the result
 +from being shared with requests that other calls
 +issued before it was retrieved.
 +For this reason, the VXML maxage attribute or
 +the audiomaxage property are the only reliable
 +means of fully disabling caching.
 +
 +Two audio data requests are considered sharable
 +from the same cache entry if they reference the
 +same URL (including all parameters **and cookies**),
 +or if they specify the same TTS request.
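
One way to picture the sharing rule is as a cache key computed from everything that distinguishes a request. The sketch below is purely hypothetical (voiceglue's real key format is not documented here, and including the language in the TTS key is an assumption); it only shows why two requests for the same URL with different cookies end up in different cache entries.

```python
import hashlib


def audio_cache_key(url: str, cookies: dict) -> str:
    """Hypothetical key: the URL (with its parameters) plus all cookies."""
    cookie_part = "&".join(f"{name}={value}" for name, value in sorted(cookies.items()))
    return hashlib.sha256(f"audio\n{url}\n{cookie_part}".encode("utf-8")).hexdigest()


def tts_cache_key(text: str, lang: str) -> str:
    """Hypothetical key for a TTS request: the text plus the language."""
    return hashlib.sha256(f"tts\n{lang}\n{text}".encode("utf-8")).hexdigest()


same_url = "http://localhost/audio/welcome.wav?voice=a"
key_one = audio_cache_key(same_url, {"session": "111"})
key_two = audio_cache_key(same_url, {"session": "222"})
shared = audio_cache_key(same_url, {}) == audio_cache_key(same_url, {})
```

Here ''key_one'' and ''key_two'' differ even though the URL is identical, so a per-call session cookie would defeat sharing entirely.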
 +
 +===== Cache Expiry =====
 +
 +Cached audio data in the shared directory will get
 +removed when it has not been used by any application
 +for some period of time.
 +Currently voiceglue checks for expiry of cache items
 +every 7 minutes (configurable by the cache_purge_interval
 +parameter in /etc/voiceglue.conf), and removes those that have gone
 +unused for the last 4 minutes (configurable by the
 +cache_lastused_purge parameter in /etc/voiceglue.conf).
 +There is no way currently to set a maximum cache
 +size for voiceglue; this may be implemented in
 +the future.
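
The interaction of the two parameters can be sketched as a periodic sweep. The function below is an illustration only, with invented names and a plain dictionary standing in for the cache; it keeps entries used within the last cache_lastused_purge seconds and drops the rest.

```python
CACHE_PURGE_INTERVAL = 420   # seconds between sweeps (default)
CACHE_LASTUSED_PURGE = 240   # seconds of non-use before removal (default)


def purge_unused(last_used: dict, now: float,
                 lastused_purge: float = CACHE_LASTUSED_PURGE) -> dict:
    """Return the entries that survive a sweep at time 'now'.

    last_used maps a cache key to the time (in seconds) of its last use.
    """
    return {key: used for key, used in last_used.items()
            if now - used < lastused_purge}


# Longest an idle entry can linger with the defaults: it narrowly survives
# one sweep and is removed by the next one.
WORST_CASE_IDLE_SECONDS = CACHE_PURGE_INTERVAL + CACHE_LASTUSED_PURGE

survivors = purge_unused({"a": 0.0, "b": 100.0}, now=300.0)
```

With the defaults, an entry last used 239 seconds before a sweep survives it and is only removed 420 seconds later, so idle audio can remain on disk for up to about 660 seconds.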
 +
 +====== Audio fetch cookie handling ======
 +
 +As mentioned above, cookies are implicit parameters to
 +audio fetch requests.
 +Thus, if one call's document fetches set a cookie to
 +a value, and another call's document fetches set that
 +cookie to a different value (or create a different set of cookies),
 +then even though they request audio from the same URL
 +they will not share the audio.
 +Because of this, it is important to realize the negative
 +effects on caching that cookies can have, and to
 +not use cookies haphazardly.
 +
 +Although cookies can be set by document, script, or
 +grammar fetches, they cannot be set by audio fetches.
 +They are, however, always provided on all fetches, including
 +audio fetches.
 +The shared caching of audio fetches makes the setting
 +of cookies from audio fetches often counterintuitive.
 +It could be argued that there are cases where it still
 +should be allowed, for example when shared caching is
 +explicitly prohibited with maxage=0, but this is not
 +currently implemented.
 +
 +====== Alternate TTS ======
 +
 +Using an alternate TTS implementation should be fairly
 +straightforward.  Every time TTS audio is required, voiceglue
 +runs the ''/usr/bin/voiceglue_tts_gen'' program with four
 +arguments.  The first is -t (and can be ignored),
 +the second is the text to speak, the third
 +is the file in which to place the audio,
 +and the fourth is the language as specified
 +by the VXML's ''xml:lang'' setting.
 +The generated audio file must be in 16-bit 8kHz PCM wav format
 +with a RIFF header.
 +
 +The default implementation of voiceglue_tts_gen for flite
 +is:
 +
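 + #!/usr/bin/perl --      -*-CPerl-*-
 + # Arguments: -t, text, output file, language
 + $file = $::ARGV[2];
 + # Run flite as "flite -t <text> <file>" to synthesize the speech
 + system ("flite", @::ARGV[0..2]);
 + # Resample flite's 16khz output down to 8khz
 + system ("mv", $file, $file . ".16khz.wav");
 + system ("sox", $file . ".16khz.wav", "-r", "8000", $file);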
 +
 +The last two lines convert the format from flite's default
 +(on Ubuntu) output of 16khz wav to 8khz.
 +
 +If you want to use Cepstral, this voiceglue_tts_gen file has
 +worked:
 +
 + #!/usr/bin/perl --      -*-CPerl-*-
 + # Cepstral interface
 + $file = $::ARGV[2];
 + system ("/usr/local/bin/swift", "-m", "text", "-o" , $file , $ARGV[1]);
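
When writing your own voiceglue_tts_gen, it can help to check that the file it produces has the required format. The sketch below uses Python's standard ''wave'' module and invented helper names. It unpacks the four arguments described above, writes a short silent file in the expected format as a stand-in for real TTS output, and verifies the sample width and rate. Writing mono audio is an assumption of this example; the text above only specifies 16-bit 8kHz PCM.

```python
import wave


def parse_tts_args(argv: list):
    """Unpack the four arguments: -t (ignored), text, output file, language."""
    _flag, text, out_file, lang = argv[:4]
    return text, out_file, lang


def write_silence(path: str, seconds: float = 0.1) -> None:
    """Write a RIFF wav of 16-bit 8kHz silence (mono is assumed here)."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)       # 2 bytes per sample = 16-bit
        out.setframerate(8000)    # 8kHz
        out.writeframes(b"\x00\x00" * int(8000 * seconds))


def has_required_format(path: str) -> bool:
    """True if the file is a wav with 16-bit samples at 8kHz."""
    with wave.open(path, "rb") as wav:
        return wav.getsampwidth() == 2 and wav.getframerate() == 8000
```

A replacement script could call ''has_required_format()'' on its own output during testing, before it is installed as ''/usr/bin/voiceglue_tts_gen''.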
 +
 +====== Transfer ======
 +
 +Voiceglue has rudimentary support for the <transfer> tag in VXML.
 +It only supports blind transfers.
 +There are two different Asterisk mechanisms that may be used
 +to implement the transfer, the "transfer" command and the "dial"
 +command.
 +The "transfer" command is the default and correctly returns control
 +to the VXML script immediately, but is a less reliable Asterisk command.
 +The "dial" command is more reliable in Asterisk, but does not return
 +control to the VXML script until the transferred call disconnects or fails.
 +
 +These choices are controlled by the blind_xfer_method parameter
 +in the /etc/voiceglue.conf file.
 +
 +====== Example VXML Files ======
 +
 +The directory "examples" here contains some example VXML files that
 +work with voiceglue.  Keep in mind that there is much (some say too
 +much) latitude in the VXML specification as to what could be
 +supported, so not all VXML files will run without modification.
 +Specifically, voiceglue only supports simple SRGS XML DTMF grammars,
 +and does not yet support speech input (this is being worked on).
 +
 +Here is what the example files do:
 +
 + welcome-tts.vxml          -- Speaks "Welcome" in TTS
 +
 + welcome-audiofile.vxml    -- Recorded audio of Allison saying "Welcome"
 +
 + single-digit-input.vxml    -- Repeatedly gets and speaks a single digit
 +
 + menu-input.vxml            -- Repeatedly gets a menu input
 +
 + multi-digit-input.vxml    -- Repeatedly gets and speaks multiple digits
 +
 + record-audio.vxml          -- Repeatedly records audio from the caller
 +
 
voiceglue_0.12_user_guide.txt · Last modified: 2010/05/14 05:35 by soup
 