
ILA - teachable voice assistant / Blog: Recent posts

ILA Beta v3.8 with more context :-)

Hello everybody! It's been a while since the last update but finally here it is: ILA Beta v3.8! :-)

The most obvious first: ILA got a fresh new look ^^. Hope you like it! (If not, don't worry, you can always go back to classic.) The more subtle changes are various improvements in the Add-ons, and finally ILA got some context ... I mean commands that depend on context :-) Here is a more detailed patch list:... read more

Posted by Florian 2015-06-16

ILA Beta v3.7 comes with freedom updates :-)

For some time now ILA has worked pretty reliably using the Sphinx-4 (offline) engine to recognize speech. This works mainly by restricting the vocabulary to obtain good recognition results. Restriction means a working system in this case ... but let's face it: we want FREEDOM! because fun comes with a large vocabulary :-) Up till now Google was able to give us this freedom but it required an API key (complicated to get) and came with usage restrictions :-( ... so no real freedom. But thanks to the wonderful web API of Google and a nice technique called websockets we are finally free of restrictions now! Of course that also means Google is free to gather more information on us while we use their services! =)
Besides freedom ILA beta 3.7 comes with a handful of other neat features, here is the (approx.) complete list:... read more

Posted by Florian 2015-04-25

Major updates in ILA Beta 3.6

Welcome to Beta v3.6 the next major update of ILA!

This time the emphasis lies on more content and improved Pocketsphinx support. Pocketsphinx is the lightweight speech recognition engine of CMU that is optimized for mobile devices and single-board computers like the Raspberry Pi(2). It also comes with a "keyword search" mode giving the user an alternative to Sphinx-4 when using "Hey ILA" to activate the assistant.
Here is a more detailed list of what's new:... read more

Posted by Florian 2015-04-04

ILA Beta v3.5 - fixes for Linux and first try on pocketsphinx integration

I noticed some nasty bugs in the Linux (Ubuntu 14.04) version of ILA, mainly coming from conflicts between Minim, Java and PulseAudio. In the worst case ILA could not speak anymore or crashed a lot. I fixed what I could, but if you still have problems I strongly recommend installing the latest Oracle Java 8. Together with the bug fixing I've also included some new features :-)

  • Integration of the Pocketsphinx command line tool to better support low performance systems. To use it you need to have Pocketsphinx installed (Linux) or put a precompiled version (Win) in the subfolder SpeechData/Pocketsphinx (a Win8 version is included). Unfortunately I couldn't get the keyphrase spotter running yet. There is a config file for Pocketsphinx too in case you want to add parameters (SpeechData/default.pocketsphinx.config) (see pocketsphinx tutorial)
  • Added grammar/non-grammar switching for pocketsphinx too and improved it to work better in combination with 'hey ILA' (if you've deactivated grammar restrictions completely)
  • I've completely rewritten the 3rd layer of input analysis (the 1st is a check of the teachIt memory file, the 2nd is keyword isolation). If the 1st and 2nd layers fail, ILA will try an approximate match against the language knowledge base, that is, everything inside teachIt.txt and languageModelBase.txt. The approximate match is done with (kind of) an edit distance. The threshold for an approximate match can be set in the config file (Data/config.properties -> approxSearchErrorRateThresh, basically the WER)
  • For people experimenting with 'addons' there is a new method "ILA_speechControl.askDirectQuestion("whaaaat?");" that can be used to let ILA ask direct questions from anywhere inside the code. The answer is obtained by checking ILA.lastInput. Check the 'batchaccuracytest' addon for an example. ... read more
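To give an idea of how such an approximate match can work, here is a minimal sketch in Java. It is not ILA's actual code: the class and method names are made up for illustration, and it assumes a plain word-level Levenshtein distance divided by the number of words in the known sentence (a WER-like error rate), whereas the real implementation is only described as "kind of" an edit distance.

```java
import java.util.List;

public class ApproxMatch {

    /** Word-level Levenshtein distance between two token arrays. */
    public static int editDistance(String[] a, String[] b) {
        int[] prev = new int[b.length + 1];
        int[] curr = new int[b.length + 1];
        for (int j = 0; j <= b.length; j++) prev[j] = j;
        for (int i = 1; i <= a.length; i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length; j++) {
                int cost = a[i - 1].equals(b[j - 1]) ? 0 : 1;
                // cheapest of insertion, deletion, substitution/match
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length];
    }

    /** Lower-cases the sentence and splits it on whitespace. */
    public static String[] tokens(String s) {
        String t = s.trim().toLowerCase();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    /** WER-like error rate: word edits divided by the reference length. */
    public static double errorRate(String input, String reference) {
        String[] in = tokens(input);
        String[] ref = tokens(reference);
        if (ref.length == 0) return in.length == 0 ? 0.0 : 1.0;
        return (double) editDistance(in, ref) / ref.length;
    }

    /** Best-matching known sentence, or null if none is within the threshold. */
    public static String bestMatch(String input, List<String> knowledgeBase,
                                   double threshold) {
        String best = null;
        double bestRate = Double.MAX_VALUE;
        for (String candidate : knowledgeBase) {
            double rate = errorRate(input, candidate);
            if (rate < bestRate) { bestRate = rate; best = candidate; }
        }
        return bestRate <= threshold ? best : null;
    }
}
```

In this sketch the threshold plays the role of approxSearchErrorRateThresh: a lower value accepts only near-exact matches, a higher value tolerates more misrecognized words.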
Posted by Florian 2015-03-09

Shiny and new: ILA Beta v3.4 :-)

Hello everybody,

just a few days after the release of Beta v3.3 I'm happy to present you v3.4 already :-) I needed to fix some bugs and took the chance to include a bunch of improvements too!
Here is the (almost) complete list of changes:

  • fixed the timeout bug in the system 'test' command and made it a bit more fancy :-)
  • added the contacts list to the automatic creation of the dynamic language model (dlm) (yes! there is a contacts.txt list ^^)
  • removed any numbers from 'App'-names during auto loading into the dlm, ILA is not very flexible with number-to-string conversion yet :-(
  • added the possibility to correct (delete) what you have said by saying "I repeat" (de: ich wiederhole) or "I said" (de: ich sagte) followed by a short pause. So when you know you messed up an input just say "-pause- 'I repeat' -pause-" :-) This works especially nicely in the Live-mode! (only Sphinx-4)
  • added some more ILA comments when the program needs to reload stuff so you know now that you have to wait a bit ;-)
  • added some tooltips to settings (especially for the selection of the default recognizer)
  • finally fixed saving and loading of the speaker adaptation data for good (it works reliably now with all tested models)
  • added the PTM 8kHz acoustic model to the default set of models. I recommend trying this one if your accuracy is rather low. To use it please adjust the 'acoustic model' in settings.
  • auto-loading the sample rate of the acoustic models by placing a 'samplerate.properties'-file inside the folder of the AM (see included models)
  • added a 'test accuracy' command to re-test the accuracy on the recorded speech in Data\test.wav (created during speaker adaptation)
  • added a 'batch test' command as an addon to test a bunch of .wav-files recorded and saved with transcription. You can use the 'amt' (acoustic model training) command to record these files
  • ILA now saves the speech recorded during the system test ('test') and uses it to initialize the recognizer (usually the first sentences were always crap somehow oO ^^)
  • included updates in Sphinx-4 (LiveCMN and BatchCMN improve the recognizer, lower-case dictionary)
  • completely rewrote the Google speech recognition part to get rid of old bugs and dependencies and removed the old API
  • more bug fixing... read more
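The removal of numbers from 'App'-names mentioned in the list above could look roughly like the following sketch. This is only an illustration, not ILA's code: the class and method names are hypothetical, and it assumes that version-like strings such as "3.5" should be dropped as a whole.

```java
public class AppNameCleaner {

    /**
     * Removes digit runs (including version-like forms such as "3.5")
     * and collapses the whitespace that is left behind.
     */
    public static String stripNumbers(String appName) {
        return appName
                .replaceAll("\\d+(?:[.,]\\d+)*", " ")
                .replaceAll("\\s+", " ")
                .trim();
    }
}
```

With this approach an entry like "Office 2013 Word" would end up in the language model as "Office Word", so the recognizer never has to deal with spoken numbers.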
Posted by Florian 2015-02-28

May I present to you: ILA Beta v3.3 :-)

This is ILA Beta v3.3 the next big step forward!

ILA has just become even more customizable, more reliable, faster and smarter!
Here is a list of what's new:

  • ILA has been updated to support the open source Text-to-Speech system MaryTTS. This means basically 2 things: ILA is completely free from any cloud service now (if you want) aaaand you can add new voices yeah! :-)

  • Beta v3.3 introduces a new dynamic language model, something I really enjoy! It means even the grammar-free mode can now learn everything you teach ILA, and the program gets better at recognizing these commands every time. One major step away from grammar restrictions towards natural language recognition.... read more

Posted by Florian 2015-02-20

Welcome to ILA Beta v3.2!

I'm happy to announce the release of ILA Beta v3.2! :-)

for older patch-notes please visit the ILA homepage

The main focus lies on improving grammar-free speech recognition accuracy with Sphinx-4, bringing ILA one step closer to becoming independent of Google's Speech API. Here are the recent patch notes:

  • updated to the most recent version of Sphinx-4 with support for the new PTM acoustic models
  • included the new CMU Sphinx en-us acoustic model (non-PTM) with greatly improved accuracy
  • added MLLR unsupervised speaker adaptation (type 'ussa' in ILA's input field) and auto-loading of MLLR_matrix files when added to the acoustic model folder
  • added a GUI to help you train your acoustic models (type 'amt' in ILA's input field) (tutorial soon)
  • the pre-rec recognizer (settings->ILA speech engine->Sphinx-4 offline (rec)) works much better now in the grammar-free mode (settings->use grammar->red(off)).
    Note: the LiveSpeechRecognizer (Sphinx-4 offline (live)) only works reliably with grammar turned on, I'm trying to find the problem!
  • to be able to use grammar-free mode I've added two simple language models for 'en' and 'de'
  • new setting that allows grammar + non-grammar mixing (settings->use grammar on ILA question). That means ILA will switch back to grammar mode when you specifically tell her to, e.g. in an 'open parameter' command
  • 'open parameter' commands have been improved to filter user-specified words to prevent things like "play some music of of Jimi Hendrix" (tutorial available soon)
  • added a button for the audio samplerate to the settings (if you want to use 8kHz acoustic models) and fixed a bug that actually prevented switching to anything other than 16kHz
  • when adding new commands to the grammar and ILA's memory (teachit_xy.txt) ILA now checks if this command already exists and replaces the old one instead of adding a 'dead' command to the end of the files
  • added a bunch of Icons to the windows shown in the taskbar (windows) and an updated manifest file (for windows start screen)
  • many more or less visible tweaks to the UI and as usual bugfixing (yes it's still a beta ;-) ) e.g. fixing problems with unsupported translucency and buttons in the Mac version... read more
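The "replace instead of adding a dead command" behaviour from the list above can be sketched as follows. Please treat this as an assumption-heavy illustration and not as ILA's implementation: the real format of teachit_xy.txt is not described here, so the ";;" separator, the class name and the method names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class CommandStore {

    // Hypothetical separator between the spoken command and its action.
    static final String SEP = ";;";

    /** The command part of a line, normalized for comparison. */
    public static String key(String line) {
        int i = line.indexOf(SEP);
        String k = i < 0 ? line : line.substring(0, i);
        return k.trim().toLowerCase();
    }

    /**
     * Returns a copy of the lines in which newLine replaces an existing
     * entry with the same command, or is appended if there is none.
     */
    public static List<String> upsert(List<String> lines, String newLine) {
        List<String> result = new ArrayList<>(lines);
        String k = key(newLine);
        for (int i = 0; i < result.size(); i++) {
            if (key(result.get(i)).equals(k)) {
                result.set(i, newLine); // overwrite in place, no dead entry
                return result;
            }
        }
        result.add(newLine);
        return result;
    }
}
```

Replacing in place keeps the file the same length when a command is re-taught, so an older definition can never shadow or duplicate the new one.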
Posted by Florian 2015-01-29
Labels: Speech Recognition, Intelligent Agents, Voice Assistant, AI