Skinny Satan

A cool Nerd To Hangout With

Speech Recognition with AVR/Atmega

Speech recognition, or voice recognition, is not an easy thing to do. You have probably seen people struggle with advanced voice recognition tools like Siri, even though they are developed by some of the world’s best engineers. Still, in this article I will show you how to recognize a limited set of keywords. And trust me, it is a lot of fun to do on-board speech recognition instead of connecting to a computer and using its huge processing power when the task isn’t huge. I first attempted to do this entirely on-chip with my Atmega128, but alas, it has very limited processing power, plus it is not an MSP built to process such signals. So instead I moved on to an external voice recognition module.

(Note: the main code is way below. It is an .odt file; download it, rename it to vr.zip and extract it. Sorry, WordPress restrictions.)

[Image: EasyVR2.2_-3]

EasyVR is a voice recognition module by VeeaR, from TIGAL. It has a serial interface for the connection, the EasyVR Commander software to train the commands, and everything else you need for the recognition part. In this tutorial I will show you how to interface the EasyVR with your AVR-based embedded project.

Check 1– Buy an EasyVR
It’s available everywhere; check SparkFun if you can’t find it.

Check 2– Connect it to your system
Now you need to connect the EasyVR to your computer, but before that install the EasyVR Commander software from VeeaR: http://www.veear.eu/downloads/
After that, connect the EasyVR to your serial port as per the following diagram. If your computer doesn’t have a serial port, buy a USB-to-serial (TTL) converter.

[Diagram: easyvr-connection]

Check 3– Train it
Open EasyVR Commander, choose the port the module is connected to and press Connect.
Now choose group 1 and start adding your commands. You can add up to 32 commands.
When done, press Train for each command and speak into the microphone. Once every command is trained, you are done. (Do note the index of each command on the left-hand side; those are the values you will get back after a successful recognition.)

Check 4– Interface it with your embedded system
Wire it the same way as shown in the diagram above.

Check 5– Burn the code into your uC
Burn the code. Once it is running, it prints L on your LCD; that is when you should speak one of the keywords you trained into the module. When it is done listening, it prints D and displays the index of the command on the LCD.
Adapt it to your application.

How does it work—

Well, since this is a stamp module, it has its own protocol to follow to initiate requests and acknowledge everything. The library code below takes care of that, so it won’t be a problem on the user side.
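
To give a feel for what that protocol looks like on the wire, here is a minimal sketch of how arguments are encoded, based on the ARG_* constants in the vr.h listing further down. Commands are single letters, and every numeric argument is sent as one byte offset from ARG_ZERO. The helper names encode_arg, decode_arg and build_recog_request are made up for illustration and are not part of the library.

```c
#include <stdint.h>
#include <stddef.h>

/* Values copied from the vr.h listing */
#define CMD_RECOG_SD 'd'   /* start recognition on a trained group */
#define ARG_MIN      0x40  /* wire byte for -1 */
#define ARG_MAX      0x60  /* wire byte for +31 */
#define ARG_ZERO     0x41  /* wire byte for 0 */

/* Hypothetical helper: turn a value in -1..31 into its wire byte. */
static uint8_t encode_arg(int8_t value)
{
    return (uint8_t)(value + ARG_ZERO);
}

/* Hypothetical helper: decode a wire byte.
   Returns 1 and stores the value if the byte is a valid argument,
   returns 0 otherwise. */
static int decode_arg(uint8_t byte, int8_t *value)
{
    if (byte < ARG_MIN || byte > ARG_MAX)
        return 0;
    *value = (int8_t)(byte - ARG_ZERO);
    return 1;
}

/* Hypothetical helper: a "recognise from group N" request is the
   command letter followed by the encoded group number.
   Returns the number of bytes written to out. */
static size_t build_recog_request(int8_t group, uint8_t out[2])
{
    out[0] = CMD_RECOG_SD;
    out[1] = encode_arg(group);
    return 2;
}
```

For group 1 this produces the two bytes 'd' and 'B' (0x42). A reply byte of 'D' (0x44) decodes to index 3.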

[Diagram: Uart-easyvr]

The stamp works over UART with 8 data bits, 1 stop bit and no parity. We have covered it here, so check it for reference.
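
As a rough sketch of the AVR side of that setup: in normal asynchronous mode the USART baud rate register is computed as UBRR = F_CPU / (16 x baud) - 1. The 9600 baud figure mentioned below is the EasyVR default as far as I know, and the clock frequencies are only example values, so treat both as assumptions and check them against your own board. ubrr_for() is a made-up helper name.

```c
#include <stdint.h>

/* Hypothetical helper: UBRR value for normal asynchronous mode
   (U2X = 0), rounded to the nearest integer.
   f_cpu is the clock in Hz, baud the desired baud rate. */
static uint16_t ubrr_for(uint32_t f_cpu, uint32_t baud)
{
    return (uint16_t)((f_cpu + 8UL * baud) / (16UL * baud) - 1UL);
}
```

With a 16 MHz clock and 9600 baud this gives 103, and with 8 MHz it gives 51. The frame format itself (8 data bits, no parity, 1 stop bit) is selected in the USART control registers; the register names differ between AVR parts, so look them up in the datasheet of your chip.
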
Here are the functions from the library I ported.

*detect()– checks whether the EasyVR is connected
*setLanguage()– sets the input language; 0 (the default) is English
*setTimeout()– sets the timeout for voice input, in seconds
*recognizeCommand()– pass the group number of your command set to start the recognition
*hasFinished()– waits until the recognition is done and stores the recognised index in the variable _value
*getCommand()– retrieves the result stored by hasFinished()
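
The order in which these calls are normally used can be sketched like this. It is a simplified, self-contained stand-in, not the ported library itself: the UART is replaced by a fake (fake_write, fake_read and the scripted replies are assumptions made up for this example), the delays and retries are left out, and wait_for_command() and listen_once() are hypothetical names. The protocol bytes are the ones from the vr.h listing.

```c
#include <stdint.h>
#include <stddef.h>

/* Protocol bytes copied from the vr.h listing */
#define CMD_BREAK    'b'
#define CMD_LANGUAGE 'l'
#define CMD_TIMEOUT  'o'
#define CMD_RECOG_SD 'd'
#define STS_SUCCESS  'o'
#define STS_RESULT   'r'
#define ARG_ZERO     0x41
#define ARG_ACK      0x20

/* Fake UART (assumption, for illustration only): replies come from
   a script, and every byte we send is logged. */
static const char *fake_replies = "";
static char sent_log[32];
static size_t sent_len;

static void fake_write(uint8_t c)
{
    if (sent_len < sizeof sent_log - 1) {
        sent_log[sent_len++] = (char)c;
        sent_log[sent_len] = '\0';
    }
}

static int fake_read(void)
{
    return *fake_replies ? (uint8_t)*fake_replies++ : -1;
}

/* Simplified versions of the library calls */
static int detect(void)
{
    fake_write(CMD_BREAK);
    return fake_read() == STS_SUCCESS;
}

static int setLanguage(int8_t lang)
{
    fake_write(CMD_LANGUAGE);
    fake_write((uint8_t)(lang + ARG_ZERO));
    return fake_read() == STS_SUCCESS;
}

static int setTimeout(int8_t seconds)
{
    fake_write(CMD_TIMEOUT);
    fake_write((uint8_t)(seconds + ARG_ZERO));
    return fake_read() == STS_SUCCESS;
}

static void recognizeCommand(int8_t group)
{
    fake_write(CMD_RECOG_SD);
    fake_write((uint8_t)(group + ARG_ZERO));
}

/* Returns the recognised index, or -1 for anything else (e.g. timeout). */
static int wait_for_command(void)
{
    int r;
    if (fake_read() != STS_RESULT)
        return -1;
    fake_write(ARG_ACK);            /* ask for the index argument */
    r = fake_read();
    return r < 0 ? -1 : r - ARG_ZERO;
}

/* One complete listen cycle against the scripted module.
   Returns the index, -1 if nothing was recognised, -2 on setup failure. */
static int listen_once(const char *module_replies)
{
    fake_replies = module_replies;
    sent_len = 0;
    sent_log[0] = '\0';

    if (!detect())       return -2;
    if (!setLanguage(0)) return -2;
    if (!setTimeout(5))  return -2;
    recognizeCommand(1);
    return wait_for_command();
}
```

Calling listen_once("ooorD") plays a module that acknowledges the three setup calls and then reports index 3 from group 1. On real hardware the fake UART would be replaced by the actual serial routines and the result would come from hasFinished() and getCommand().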

Check the code below and you will get the idea of how it works.

EasyVR library for all Atmega/AVR Microcontrollers
vr.h

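#include <stdint.h>      // uint8_t, int8_t, int16_t
#include <util/delay.h>  // _delay_ms(); F_CPU must be defined before this include

// write() and read() (UART byte out/in) and the lcd_* helpers are not
// defined in this listing; they must come from the rest of the project.

#define CMD_BREAK       'b' // abort recog or ping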
#define CMD_SLEEP       's' // go to power down
#define CMD_KNOB        'k' // set si knob 
#define CMD_LEVEL       'v' // set sd level 
#define CMD_LANGUAGE    'l' // set si language 
#define CMD_TIMEOUT     'o' // set timeout 
#define CMD_RECOG_SI    'i' // do si recog from ws 
#define CMD_TRAIN_SD    't' // train sd command at group  pos 
#define CMD_GROUP_SD    'g' // insert new command at group  pos 
#define CMD_UNGROUP_SD  'u' // remove command at group  pos 
#define CMD_RECOG_SD    'd' // do sd recog at group  (0 = trigger mixed si/sd)
#define CMD_ERASE_SD    'e' // reset command at group  pos 
#define CMD_NAME_SD     'n' // label command at group  pos  with length  name 
#define CMD_COUNT_SD    'c' // get command count for group 
#define CMD_DUMP_SD     'p' // read command data at group  pos 
#define CMD_MASK_SD     'm' // get active group mask
#define CMD_RESETALL    'r' // reset all commands and groups
#define CMD_ID          'x' // get version id
#define CMD_DELAY       'y' // set transmit delay  (log scale)
#define CMD_BAUDRATE    'a' // set baudrate  (bit time, 1=>115200)
#define CMD_QUERY_IO    'q' // configure, read or write I/O pin  of type 
#define CMD_PLAY_SX     'w' // wave table entry  (10-bit) playback at volume 
#define CMD_PLAY_DTMF   'w' // play (=-1) dial tone  for duration 
#define CMD_DUMP_SX     'h' // dump wave table entries
#define CMD_DUMP_SI     'z' // dump si settings for ws  (or total ws count if -1)
#define CMD_SEND_SN     'j' // send sonicnet token with bits  index  at time 
#define CMD_RECV_SN     'f' // receive sonicnet token with bits  rejection  timeout 

#define STS_MASK        'k' // mask of active groups 
#define STS_COUNT       'c' // count of commands  (or number of ws )
#define STS_AWAKEN      'w' // back from power down mode
#define STS_DATA        'd' // provide training , conflict , command label  (counted string)
#define STS_ERROR       'e' // signal error code 
#define STS_INVALID     'v' // invalid command or argument
#define STS_TIMEOUT     't' // timeout expired
#define STS_INTERR      'i' // back from aborted recognition (see 'break')
#define STS_SUCCESS     'o' // no errors status
#define STS_RESULT      'r' // recognised sd command  - training similar to sd 
#define STS_SIMILAR     's' // recognised si  (in mixed si/sd) - training similar to si 
#define STS_OUT_OF_MEM  'm' // no more available commands (see 'group')
#define STS_ID          'x' // provide version id 
#define STS_PIN         'p' // return pin state 
#define STS_TABLE_SX    'h' // table entries count  (10-bit), table name  (counted string)
#define STS_GRAMMAR     'z' // si grammar: flags , word count , labels...  (n counted strings)
#define STS_TOKEN       'f' // received sonicnet token 

// protocol arguments are in the range 0x40 (-1) to 0x60 (+31) inclusive
#define ARG_MIN     0x40
#define ARG_MAX     0x60
#define ARG_ZERO    0x41

#define ARG_ACK     0x20    // to read more status arguments

#define DEF_TIMEOUT 100
#define WAKE_TIMEOUT 200
#define PLAY_TIMEOUT 5000
#define NO_TIMEOUT 0

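// Status flags set by hasFinished(); all of them start cleared
uint8_t _command = 0;   // a trained (sd) command was recognised
uint8_t _builtin = 0;   // a built-in (si) word was recognised
uint8_t _error = 0;
uint8_t _timeout = 0;
uint8_t _invalid = 0;
uint8_t _memfull = 0;
uint8_t _conflict = 0;
uint8_t _token = 0;
uint8_t _status = 0;

uint8_t _value, idx;    // _value holds the last result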
  

int isTimeout() { return _timeout; }

void send(uint8_t c)
{
  _delay_ms(1);
  write(c);
}

void sendCmd(int8_t c)
{
	_delay_ms(1);
	write(c);
}

void sendArg(int8_t c)
{
  send(c + ARG_ZERO);
}

void sendGroup(int8_t c)
{
  _delay_ms(1);
  write(c + ARG_ZERO);
  _delay_ms(19); // worst case time to cache a full group in memory
}

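// Waits for the given timeout (in ms), then returns whatever read() delivers.
// There is no "data available" check in this port, so the full timeout is
// always spent. A zero or negative timeout reads immediately.
int recv(int16_t timeout)
{
  while (timeout > 0)
  {
    _delay_ms(1);
    --timeout;
  }
  return read();
}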

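// Acknowledges the previous byte and reads one argument.
// Returns 1 if the received byte is in the valid argument range.
// Note: c is passed by value, so the decoded argument does not reach
// the caller; only the return value is useful.
int recvArg(int8_t c, int16_t timeout)
{
  send(ARG_ACK);
  int r = recv(timeout);
  c = r - ARG_ZERO;
  return r >= ARG_MIN && r <= ARG_MAX;
}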

/** Detects the connection is established or not**/
int detect()
{
  uint8_t i;
  for (i = 0; i < 5; ++i)
  {
    sendCmd(CMD_BREAK);

    if (recv(WAKE_TIMEOUT) == STS_SUCCESS)
      return 1;
  }
  return 0;
}

int stop()
{
  sendCmd(CMD_BREAK);

  uint8_t rx = recv(WAKE_TIMEOUT);
  if (rx == STS_INTERR || rx == STS_SUCCESS)
    return 1;
  return 0;
}

/** Sets Language for Recognition -- default 0 for English**/
int setLanguage(int8_t lang)
{        
  sendCmd(CMD_LANGUAGE);
  sendArg(lang);

  if (recv(DEF_TIMEOUT) == STS_SUCCESS)
    return 1;
  return 0;
}

/**Sets timeout-- default is 5 for 5 seconds**/
int setTimeout(int8_t seconds)
{
  sendCmd(CMD_TIMEOUT);
  sendArg(seconds);

  if (recv(DEF_TIMEOUT) == STS_SUCCESS)
    return 1;
  return 0;
}

int changeBaudrate(int8_t baud)
{
  sendCmd(CMD_BAUDRATE);
  sendArg(baud);

  if (recv(DEF_TIMEOUT) == STS_SUCCESS)
    return 1;
  return 0;
}

/**tells the Stamp to start recognising, a trigger command**/
void recognizeCommand(int8_t group)
{
  sendCmd(CMD_RECOG_SD);
  sendArg(group);
}

void recognizeWord(int8_t wordset)
{
  sendCmd(CMD_RECOG_SI);
  sendArg(wordset);
}

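/** Reads the module's reply and decodes it. Returns 0 if read() reported
    no data (negative value), 1 once a status has been handled.
    The recognised index is stored in _value. **/
int hasFinished()
{
  int r;
  int8_t rx = recv(NO_TIMEOUT);
  if (rx < 0)
  {
    return 0;
  }

  // clear the flags left over from the previous result
  _command = _builtin = _error = _timeout = 0;
  _invalid = _memfull = _conflict = _token = 0;
  _status = 0;

  lcd_gotoxy2(8);
  lcd_showvalue(rx);   // debug: show the received status byte on the LCD
  switch (rx)
  {
  case STS_SUCCESS:
    return 1;

  case STS_SIMILAR:
    _builtin = 1;
    goto GET_WORD_INDEX;

  case STS_RESULT:
    _command = 1;

  GET_WORD_INDEX:
    send(ARG_ACK);               // ask for the index argument
    r = recv(DEF_TIMEOUT);
    if (r >= ARG_MIN && r <= ARG_MAX)
    {
      _value = r - ARG_ZERO;
      return 1;
    }
    break;

  case STS_TOKEN:
    _token = 1;
    send(ARG_ACK);
    r = recv(DEF_TIMEOUT);
    if (r >= ARG_MIN && r <= ARG_MAX)
    {
      _value = (r - ARG_ZERO) << 5;
      send(ARG_ACK);
      r = recv(DEF_TIMEOUT);
      if (r >= ARG_MIN && r <= ARG_MAX)
      {
        _value |= (r - ARG_ZERO);
        return 1;
      }
    }
    break;

  case STS_TIMEOUT:
    _timeout = 1;
    return 1;

  case STS_INVALID:
    _invalid = 1;
    return 1;

  case STS_ERROR:
    _error = 1;
    send(ARG_ACK);
    r = recv(DEF_TIMEOUT);
    if (r >= ARG_MIN && r <= ARG_MAX)
    {
      _value = (r - ARG_ZERO) << 4;
      send(ARG_ACK);
      r = recv(DEF_TIMEOUT);
      if (r >= ARG_MIN && r <= ARG_MAX)
      {
        _value |= (r - ARG_ZERO);
        return 1;
      }
    }
    break;
  }

  // unexpected condition (communication error)
  _status = 0;
  _error = 1;
  return 1;
}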

int8_t getCommand() { return _command ? _value : -1; }

int8_t getWord() { return _builtin ? _value : -1; }

Download the whole code here : VR.zip

The result is returned in _value (and in the idx variable in the main file). Use that index to drive your own logic and to reuse the module in other projects.
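
As a small sketch of what "using the index" can look like, the function below maps a recognised index to an action on an output port. The three commands, their indices and the LIGHT_BIT position are purely hypothetical; the numbers have to match whatever indices EasyVR Commander shows for your own trained group.

```c
#include <stdint.h>

/* Hypothetical command set: must match the indices of your trained group */
enum {
    VOICE_LIGHT_ON     = 0,
    VOICE_LIGHT_OFF    = 1,
    VOICE_LIGHT_TOGGLE = 2
};

#define LIGHT_BIT 0x01  /* assumed: the light sits on bit 0 of an output port */

/* Returns the new port value for a recognised index. Unknown indices,
   including -1 for "no command", leave the port unchanged. */
static uint8_t apply_command(int8_t index, uint8_t port)
{
    switch (index) {
    case VOICE_LIGHT_ON:     return (uint8_t)(port | LIGHT_BIT);
    case VOICE_LIGHT_OFF:    return (uint8_t)(port & ~LIGHT_BIT);
    case VOICE_LIGHT_TOGGLE: return (uint8_t)(port ^ LIGHT_BIT);
    default:                 return port;
    }
}
```

In the firmware this could be used along the lines of PORTB = apply_command(getCommand(), PORTB); once hasFinished() has returned, with PORTB standing in for whichever port your hardware uses.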

Have fun.

4 comments on “Speech Recognition with AVR/Atmega”

  1. ramanda
    November 12, 2014

    Howto add playsound coding??help..
    or complete source…??

    • S4t4n
      November 12, 2014

      Playsound coding is done internally inside that module. Its not done anywhere on-chip.

      • ramanda
        November 12, 2014

        recognizeCommand(1);
        do
        {
        // can do some processing while waiting for a spoken command
        }
        while (!hasFinished());

        can you give me sample code with playsound?? I’m still a little not understand..
        thank before for you help…

  2. ramanda
    November 13, 2014

    i’m try with atmega32 nothing respon,
    i use easyvr v2 + shield…
    can you help me??

Information

This entry was posted on May 4, 2014 by in random stuff.