Verifying Voiceglue Installation

After a successful installation, three new services will be available to run:

  • dynlog
  • phoneglue
  • voiceglue

They cannot be started successfully, however, until Asterisk and Voiceglue are properly configured.

Configuring Asterisk

The phoneglue service needs to log in to the Asterisk manager with username “phoneglue” and password “phoneglue” (configurable with command-line arguments), so configure Asterisk's manager.conf with an entry like this for Asterisk 1.4:

     [phoneglue]
     secret=phoneglue
     read = system,call,log,verbose,command,agent,user
     write = system,call,log,verbose,command,agent,user

and like this for Asterisk 1.6:

     [phoneglue]
     secret=phoneglue
     read = system,call,log,verbose,command,agent,user,originate
     write = system,call,log,verbose,command,agent,user,originate

Also make sure you have:

     enabled=yes

in the same file.
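
To confirm that the manager.conf entry works before starting phoneglue, the same login can be attempted by hand. The following is a minimal sketch, assuming the manager interface listens on Asterisk's default port 5038 on localhost and that the default username and password above are in use. The function names are illustrative and are not part of voiceglue.

```python
import socket


def build_login_action(username="phoneglue", secret="phoneglue"):
    """Return the manager Login action as bytes (headers end with a blank line)."""
    lines = [
        "Action: Login",
        f"Username: {username}",
        f"Secret: {secret}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def try_login(host="localhost", port=5038, timeout=5.0):
    """Attempt a manager login and report whether Asterisk accepted it.

    Assumes the default manager port 5038; adjust if manager.conf differs.
    A single recv() is used for brevity, which is usually enough for the
    short login reply but is not a full protocol implementation.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.recv(1024)  # greeting banner sent by the manager interface
        sock.sendall(build_login_action())
        reply = sock.recv(4096).decode("ascii", errors="replace")
    return "Response: Success" in reply
```

Calling try_login() on the Asterisk host should return True once the [phoneglue] entry and enabled=yes are in place and Asterisk has reloaded its manager configuration.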

Passing Calls and Values to Voiceglue

The asterisk dialplan must be used to route calls to voiceglue. The simplest way to do this would be a dialplan looking something like:

     [phoneglue]
     exten => 1,1,Answer
     exten => 1,2,Agi(agi://localhost)
     exten => 1,3,Hangup

Here, whenever a call is routed to extension 1 in context phoneglue, it is first answered, then handed to voiceglue (the Agi command), then hung up.

Parameters can be passed to the VXML script like this:

     exten => 1,1,Answer
     exten => 1,2,Set(vxmlarg=bar)
     exten => 1,3,Agi(agi://localhost/foo=${vxmlarg})
     exten => 1,4,Hangup

Here, the value bar will be available in the VXML script as the value of session-scoped variable session.connection.initargs.foo and foo=bar will be passed as a URL-encoded argument to the initial VXML script fetch. As a larger example:

     exten => 1,1,Answer
     exten => 1,2,Set(vxmlurl=http%3A%2F%2Fvgweb-laptop%2Fvxml%2Furlin.vxml)
     exten => 1,3,Set(vxmlarg=foo)
     exten => 1,4,Set(virthost=vgweb-laptop)
     exten => 1,5,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
     exten => 1,6,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
     exten => 1,7,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
     exten => 1,8,Hangup

Here, three parameters (url, arg, and virthost) are passed to the VXML script, where they are available as session.connection.initargs.url, session.connection.initargs.arg, and session.connection.initargs.virthost. The url parameter is special: it is used as the URL of the initial VXML page to fetch and run for the call. If it is not specified here, voiceglue uses the definitions in the /etc/voiceglue.conf file to determine the initial script.

Notice the percent-encoding of the url argument, which keeps it from confusing the parameter parsing. The URL represented in the example is actually http://vgweb-laptop/vxml/urlin.vxml. For details on percent-encoding, see a reference such as http://en.wikipedia.org/wiki/Percent_encoding
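
The percent-encoded values do not have to be worked out by hand. The following is a small sketch using Python's standard library to build the AGI URL from the example; the helper name agi_url is hypothetical and is only meant to show how the encoding is produced.

```python
from urllib.parse import quote


def agi_url(host, **params):
    """Build an agi:// URL whose parameter values are percent-encoded."""
    # safe="" makes quote() encode ":" and "/" too, as in the example above
    query = "&".join(
        f"{name}={quote(value, safe='')}" for name, value in params.items()
    )
    return f"agi://{host}/{query}"


print(agi_url(
    "vgweb-laptop",
    url="http://vgweb-laptop/vxml/urlin.vxml",
    arg="foo",
    virthost="vgweb-laptop",
))
# agi://vgweb-laptop/url=http%3A%2F%2Fvgweb-laptop%2Fvxml%2Furlin.vxml&arg=foo&virthost=vgweb-laptop
```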

Also notice that if you pass another URL as a parameter, its percent signs must themselves be percent-encoded (% becomes %25), so that URL ends up encoded twice. E.g.:

      Set(vxmlurl=http%3A%2F%2Falt.com%2Fvxml%2Fdoit.vxml)
      Set(vxmlarg=http%253A%252F%252Fanother.site%252Fdot%252Fcom.vxml)
      Agi(agi://localhost/url=${vxmlurl}%26arg=${vxmlarg})
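
The doubly encoded value in the example can be reproduced by applying percent-encoding twice. This is a short sketch using Python's standard library, with the URL taken from the example above.

```python
from urllib.parse import quote, unquote

inner = "http://another.site/dot/com.vxml"

once = quote(inner, safe="")   # encoding used for an ordinary parameter value
twice = quote(once, safe="")   # each "%" from the first pass becomes "%25"

print(once)   # http%3A%2F%2Fanother.site%2Fdot%2Fcom.vxml
print(twice)  # http%253A%252F%252Fanother.site%252Fdot%252Fcom.vxml

# Decoding twice recovers the original URL.
assert unquote(unquote(twice)) == inner
```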

If you are getting the url from an external source (SIP for instance), then you can encode it like this:

      Set(uri_encoded=${URIENCODE(${BASE64_ENCODE(${external_uri})})})
      Agi(agi://localhost/url=${vxmlurl}%26uri=${uri_encoded})

Remember to (base64) decode it in your script!
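
The round trip can be checked offline. The sketch below approximates the two dialplan functions with Python's standard library and then reverses them; it assumes the receiving side is a script that sees the value after one level of URL decoding, and the sample SIP URI is made up for illustration.

```python
import base64
from urllib.parse import quote, unquote


def encode_like_dialplan(uri):
    """Approximate URIENCODE(BASE64_ENCODE(uri)): base64 first, then percent-encode."""
    b64 = base64.b64encode(uri.encode("utf-8")).decode("ascii")
    return quote(b64, safe="")


def decode_in_script(b64_text):
    """Base64-decode the value once URL decoding has already been applied."""
    return base64.b64decode(b64_text).decode("utf-8")


external_uri = "sip:alice@example.com;transport=tcp?x=1&y=2"  # hypothetical input
encoded = encode_like_dialplan(external_uri)

# No "&" or "=" survives, so the value cannot disturb parameter parsing.
assert "&" not in encoded and "=" not in encoded
assert decode_in_script(unquote(encoded)) == external_uri
```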

Retrieving Values From Voiceglue

Notice this portion of the last example dialplan above:

     exten => 1,5,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
     exten => 1,6,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})
     exten => 1,7,Agi(agi://vgweb-laptop/url=${vxmlurl}&arg=${vxmlarg}&virthost=${virthost})

This shows three consecutive calls to voiceglue. If the VXML script does not hang up the call with a <disconnect> tag, the dialplan continues on upon termination of the script. Additionally, if a namelist is passed to the <exit> tag in the VXML script, then those variable names and values will be set as channel variables in the asterisk dialplan. Because of asterisk limitations on AGI syntax, these values should be scalars with no special characters.

Configuring Voiceglue

Even if you use the url parameter in the AGI command in your Asterisk dialplan, the /etc/voiceglue.conf file must be present and valid. The file contains the all-important definition of ast_sound_dir (don't remove this!) and additional lines that each contain a whitespace-separated pair of DNIS (incoming number) and URL. Each pair maps an incoming phone number to the URL to load. As mentioned above, this mapping is ignored when the url parameter is passed to the AGI command from Asterisk.

The /etc/voiceglue.conf file can be changed dynamically and voiceglue will immediately notice any changes. The wildcard dnis * can be used to match anything that isn't matched otherwise.

So, an example /etc/voiceglue.conf could contain:

        * http://localhost/vxml/welcome-audiofile.vxml

This would result in all incoming calls being handled by the welcome-audiofile.vxml script found at http://localhost/vxml/.

Additional parameters that affect the operation of voiceglue may also be placed in the /etc/voiceglue.conf file. The format of every parameter is:

        parameter = value

The whitespace on either side of the “=” is required.

All of the parameters below are optional – the default value is used if it is not specified in /etc/voiceglue.conf:

Parameter             Default   Meaning
blind_xfer_method     transfer  The Asterisk method used to implement the VXML transfer tag; choices are “transfer” or “dial”
audio_fetch_retry     60        The number of seconds to wait after an audio fetch fails before retrying
audio_fetch_timeout   7         The default number of seconds to wait for audio to be retrieved from a source before timing out
audio_maxage          300       The default number of seconds that an audio cache entry remains valid
cache_purge_interval  420       The number of seconds between audio cache purges
cache_lastused_purge  240       The number of seconds of non-use that will cause an audio cache item to be purged
ssml_passthrough      0         If set to 1, all SSML markup is passed to the TTS generator
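
To make the two line formats concrete, here is a minimal sketch of how a file in this format could be read. It illustrates the format as described and is not voiceglue's own parser. The ast_sound_dir path, the phone number 5551234, and the assumption that ast_sound_dir uses the same “parameter = value” form are all hypothetical.

```python
def parse_voiceglue_conf(text):
    """Split conf text into (parameters, dnis_map) following the described format."""
    params, dnis_map = {}, {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1] == "=":
            params[fields[0]] = fields[2]    # "parameter = value"
        elif len(fields) == 2:
            dnis_map[fields[0]] = fields[1]  # "DNIS url"
    return params, dnis_map


def initial_url(dnis_map, dnis):
    """Return the URL for a dialed number, falling back to the "*" wildcard."""
    return dnis_map.get(dnis, dnis_map.get("*"))


sample = """\
ast_sound_dir = /var/lib/asterisk/sounds
blind_xfer_method = dial
5551234 http://localhost/vxml/menu-input.vxml
* http://localhost/vxml/welcome-audiofile.vxml
"""

params, dnis_map = parse_voiceglue_conf(sample)
print(params["blind_xfer_method"])       # dial
print(initial_url(dnis_map, "5551234"))  # http://localhost/vxml/menu-input.vxml
print(initial_url(dnis_map, "0000000"))  # http://localhost/vxml/welcome-audiofile.vxml
```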

Extra Logging

By default, the dynlog program collects all logs from the phoneglue and voiceglue processes. It is not strictly required, but without it you will be scouring multiple log files to find out what's happening. The logs are written to /var/log/dynlog/dynlog.

Dynlog can change its log level dynamically: running “dynlog_level 7” gives you the full output from all voiceglue components, and running “dynlog_level 4” gets you back to a more sane level. These levels are identical to those used by syslog. I recommend setting the level to 7 (the highest) when you are trying to debug a problem with voiceglue. You don't have to stop or restart anything; dynlog and its clients coordinate dynamically to achieve the appropriate level of logging. Note, however, that raising the level can have a massive performance impact on a loaded system.

The VXML <log> markups appear in dynlog as well, and are assigned log level 5.

Starting Voiceglue

After performing the above configuration steps and making sure asterisk is running, start the voiceglue services by rebooting or running as root:

      /etc/init.d/dynlog start
      /etc/init.d/phoneglue start
      /etc/init.d/voiceglue start

These services must always be brought up in this order (and after Asterisk is running), and be brought down in the reverse order.
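
The ordering rule can be captured in a small helper script. This is a sketch only: it assumes the init scripts also accept a “stop” argument (the guide shows only “start”), and it must be run as root.

```python
import subprocess

SERVICES = ["dynlog", "phoneglue", "voiceglue"]  # start order from the guide


def init_commands(action):
    """Return the init-script commands in the right order for "start" or "stop"."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    order = SERVICES if action == "start" else list(reversed(SERVICES))
    return [[f"/etc/init.d/{name}", action] for name in order]


def run(action):
    """Run the commands one by one, stopping at the first failure."""
    for cmd in init_commands(action):
        subprocess.run(cmd, check=True)


print(init_commands("stop"))
# [['/etc/init.d/voiceglue', 'stop'], ['/etc/init.d/phoneglue', 'stop'], ['/etc/init.d/dynlog', 'stop']]
```

Calling run("start") after Asterisk is up, and run("stop") before shutting Asterisk down, follows the order described above.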

Once everything is up and stays up, you should be able to call in and have voiceglue interpret the VXML file(s) specified in /etc/voiceglue.conf or in the AGI arguments.

Audio file formats

Voiceglue supports the following audio file formats:

  • ulaw
  • alaw
  • slin (16-bit 8 kHz signed linear)
  • gsm
  • mp3

Each of these is only supported in so far as the installed Asterisk supports them.

WARNING: Some versions of Asterisk have a bug in their implementation of mp3 support for the STREAM FILE command. Until this bug is fixed, voiceglue cannot play mp3s.

The audio format of a file must be able to be determined by voiceglue before it can be used. Voiceglue first checks the Content-Type returned by the HTTP server that supplied the audio data. The supported Content-Type fields are:

Content-Type        Audio Format
audio/basic         ulaw
audio/x-alaw-basic  alaw
audio/x-wav         slin
audio/x-gsm         gsm
audio/mpeg          mp3

If the Content-Type field is not defined, or is returned as text/plain (which is common if your web server is not configured with the proper content type mapping for the audio file extensions), then voiceglue attempts to determine the audio file type by the filename extension. The supported extensions are:

Extension  Audio Format
.ulaw      ulaw
.au        ulaw
.pcm       ulaw
.ul        ulaw
.mu        ulaw
.alaw      alaw
.al        alaw
.wav       slin
.gsm       gsm
.mp3       mp3
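
The two-step lookup described above can be sketched as follows. This models the documented behavior and is not voiceglue's actual code. Two details are assumptions not covered by the guide: the comparison is case-insensitive, and an unrecognized Content-Type other than text/plain yields no format rather than falling back to the extension.

```python
import os
from urllib.parse import urlparse

CONTENT_TYPES = {
    "audio/basic": "ulaw",
    "audio/x-alaw-basic": "alaw",
    "audio/x-wav": "slin",
    "audio/x-gsm": "gsm",
    "audio/mpeg": "mp3",
}

EXTENSIONS = {
    ".ulaw": "ulaw", ".au": "ulaw", ".pcm": "ulaw", ".ul": "ulaw", ".mu": "ulaw",
    ".alaw": "alaw", ".al": "alaw",
    ".wav": "slin", ".gsm": "gsm", ".mp3": "mp3",
}


def audio_format(url, content_type=None):
    """Return the audio format name, or None if it cannot be determined."""
    if content_type:
        media_type = content_type.split(";")[0].strip().lower()
        if media_type in CONTENT_TYPES:
            return CONTENT_TYPES[media_type]
        if media_type != "text/plain":
            return None  # assumption: unknown types are not second-guessed
    # Content-Type missing or text/plain: fall back to the filename extension
    ext = os.path.splitext(urlparse(url).path)[1].lower()
    return EXTENSIONS.get(ext)


print(audio_format("http://localhost/audio/welcome.wav", "text/plain"))  # slin
print(audio_format("http://localhost/audio/welcome.al"))                 # alaw
```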

Audio Streaming

The VXML specification does not require audio streaming. It implies that audio fetches are finite, and requires only that an implementation start playing audio after the resource has been completely fetched. The specification does permit an “optimization” whereby an implementation can start playing audio before it has been completely fetched, but voiceglue does not perform this optimization.

Audio Caching

Voiceglue employs a shared audio caching mechanism that provides significant performance gains when multiple calls use the same audio data.

All audio data, whether downloaded from an HTTP server or generated by TTS, is cached by default in the filesystem that is shared between voiceglue and asterisk. This cached audio data is used for all calls until it expires based on the HTTP headers returned from the web server or lack of use by the application.

Voiceglue never uses stale audio data. Thus, the VXML maxstale attribute and audiomaxstale property have no effect.

If the script author desires to prevent caching, the VXML maxage attribute or the audiomaxage property can be set to 0. This will force a non-shared and non-reusable audio fetch for that instance.

It is not possible to prevent caching by returning an HTTP header value that disables caching, such as Expires, Cache-Control:no-cache, or Cache-Control:private. While these values will prevent further use of the audio data returned, they will not prevent the sharing of audio requests from other calls that were generated before this result was retrieved. For this reason, the VXML maxage attribute and the audiomaxage property are the only reliable means of fully disabling caching.

Two audio data requests are considered sharable from the same cache entry if they reference the same URL (including all parameters and cookies), or if they specify the same TTS request.

Cache Expiry

Cached audio data in the shared directory will get removed when it has not been used by any application for some period of time. Currently voiceglue checks for expiry of cache items every 7 minutes (configurable by the cache_purge_interval parameter in /etc/voiceglue.conf), and removes those that have gone unused for the last 4 minutes (configurable by the cache_lastused_purge parameter in /etc/voiceglue.conf). There is no way currently to set a maximum cache size for voiceglue; this may be implemented in the future.
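
The interaction of the two parameters can be illustrated with a small model of a purge pass. This is a sketch of the described behavior, not voiceglue's implementation; the cache keys and times are made up, and whether an item idle for exactly the limit is kept is an assumption.

```python
CACHE_PURGE_INTERVAL = 420  # seconds between purge passes (default)
CACHE_LASTUSED_PURGE = 240  # seconds of non-use before an item is purged (default)


def purge(last_used, now, max_idle=CACHE_LASTUSED_PURGE):
    """Return the cache entries that survive a purge pass at time `now`.

    `last_used` maps a cache key to the time (in seconds) it was last used.
    """
    return {key: used for key, used in last_used.items() if now - used <= max_idle}


# Hypothetical entries: one used 100 s ago, one used 300 s ago.
cache = {"welcome.wav": 900.0, "menu.wav": 700.0}
print(sorted(purge(cache, now=1000.0)))  # ['welcome.wav']
```

If this reading of the parameters is right, then with the defaults an unused item could remain on disk for up to roughly 240 + 420 seconds after its last use, since purging happens only once per interval.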

Audio fetch cookie handling

As mentioned above, cookies are implicit parameters to audio fetch requests. Thus, if one call's document fetches set a cookie to a value, and another call's document fetches set that cookie to a different value (or create a different set of cookies), then even though they request audio from the same URL they will not share the audio. Because of this, it is important to realize the negative effects on caching that cookies can have, and to not use cookies haphazardly.
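
One way to picture this is as a cache key computed from both the URL and the cookie set. The sketch below is an illustration only; the hashing scheme and the function name are invented here and say nothing about how voiceglue actually stores its cache.

```python
import hashlib


def fetch_cache_key(url, cookies):
    """Return a key that matches only when the URL and the cookie set both match."""
    cookie_part = "&".join(f"{name}={value}" for name, value in sorted(cookies.items()))
    return hashlib.sha256(f"{url}\n{cookie_part}".encode("utf-8")).hexdigest()


url = "http://localhost/audio/welcome.wav"
call_a = fetch_cache_key(url, {"session": "111"})
call_b = fetch_cache_key(url, {"session": "222"})
call_c = fetch_cache_key(url, {"session": "111"})

print(call_a == call_b)  # False: same URL, different cookie value, no sharing
print(call_a == call_c)  # True: same URL and cookies, audio can be shared
```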

Although cookies can be set by document, script, or grammar fetches, they cannot be set by audio fetches. They are, however, always provided on all fetches, including audio fetches. The shared caching of audio fetches makes the setting of cookies from audio fetches often counterintuitive. It could be argued that there are cases where it still should be allowed, for example when shared caching is explicitly prohibited with maxage=0, but this is not currently implemented.

Alternate TTS

Using an alternate TTS implementation should be fairly straightforward. Every time TTS is required, voiceglue runs the /usr/bin/voiceglue_tts_gen program with four arguments. The first is -t (and can be ignored), the second is the text to speak, the third is the file in which to place the audio, and the fourth is the language as specified by the VXML's xml:lang setting. The generated audio file must be in 16-bit 8 kHz PCM WAV format with a RIFF header.

The default implementation of voiceglue_tts_gen for flite is:

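#!/usr/bin/perl --       -*-CPerl-*-
# Arguments: -t, text, output file, language (the language is unused here)
$file = $::ARGV[2];
# Synthesize with flite: flite -t <text> <file>
system ("flite", @::ARGV[0..2]);
# Resample flite's 16 kHz output to 8 kHz with sox
system ("mv", $file, $file . ".16khz.wav");
system ("sox", $file . ".16khz.wav", "-r", "8000", $file);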

The last two lines convert the format from flite's default (on Ubuntu) output of 16 kHz WAV to 8 kHz.

If you want to use Cepstral, this voiceglue_tts_gen file has worked:

#!/usr/bin/perl --       -*-CPerl-*-
# Cepstral interface
$file = $::ARGV[2];
system ("/usr/local/bin/swift", "-m", "text", "-o" , $file , $ARGV[1]);

Transfer

Voiceglue has rudimentary support for the <transfer> tag in VXML. It only supports blind transfers. There are two different Asterisk mechanisms that may be used to implement the transfer: the “transfer” command and the “dial” command. The “transfer” command is the default and correctly returns control to the VXML script immediately, but is a less reliable Asterisk command. The “dial” command is more reliable in Asterisk, but does not return control to the VXML script until the transferred call disconnects or fails.

These choices are controlled by the blind_xfer_method parameter in the /etc/voiceglue.conf file.

Example VXML Files

The directory “examples” here contains some example VXML files that work with voiceglue. Keep in mind that there is much (some say too much) latitude in the VXML specification as to what could be supported, so not all VXML files will run without modification. Specifically, voiceglue only supports simple SRGS XML DTMF grammars and no speech input (speech input is being worked on).

Here is what the example files do:

welcome-tts.vxml – Speaks “Welcome” in TTS

welcome-audiofile.vxml – Recorded audio of Allison saying “Welcome”

single-digit-input.vxml – Repeatedly gets and speaks a single digit

menu-input.vxml – Repeatedly gets a menu input

multi-digit-input.vxml – Repeatedly gets and speaks multiple digits

record-audio.vxml – Repeatedly records audio from the caller

 
voiceglue_0.12_user_guide.txt · Last modified: 2010/05/14 05:35 by soup
 