I. The Speech Recognition Package

1. Installation

        Installation is straightforward and can be done entirely with Ubuntu package commands. First install the dependencies:
$ sudo apt-get install gstreamer0.10-pocketsphinx
$ sudo apt-get install ros-indigo-audio-common    # I am using the indigo version of ROS
$ sudo apt-get install libasound2
$ sudo apt-get install gstreamer0.10-gconf

   Then install the pocketsphinx ROS package itself:

 

sudo apt-get install ros-indigo-pocketsphinx

2. Once the installation is complete, we can run a test.
        First, plug in your microphone and check in the system sound settings that it is actually picking up audio.
        Then run the test launch file that comes with the package:

 

 

$ roslaunch pocketsphinx robocup.launch 

 We can also watch the recognition results that ROS publishes:

$ rostopic echo /recognizer/output
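
 The recognizer publishes each recognized phrase as a std_msgs/String message on /recognizer/output; this is the same topic the control node in section III subscribes to. As a minimal sketch (the node name here is just an example), a listener that prints whatever is heard could look like this:

#!/usr/bin/env python
# Minimal sketch: print every phrase published by the pocketsphinx recognizer.
import rospy
from std_msgs.msg import String

def on_speech(msg):
    rospy.loginfo("Heard: %s", msg.data)

rospy.init_node('speech_listener')
rospy.Subscriber('/recognizer/output', String, on_speech)
rospy.spin()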

 

 

II. The Speech Vocabulary


1. Viewing the Vocabulary

        This recognizer works offline: a set of common words and phrases is placed in a file that serves as the recognition vocabulary, the incoming speech signal is recognized segment by segment, and the result is matched against the text in that vocabulary. To see which phrases the vocabulary contains, run the following commands:
$ roscd rbx1_speech/config 
$ more nav_commands.txt

 

 These are the phrases that can be recognized; you can edit them or replace them with the ones you need.
        Next, this file has to be compiled into a language model and dictionary. This step is done online: go to http://www.speech.cs.cmu.edu/tools/lmtool-new.html, upload the file as the site instructs, and let it generate the model files for you.

 (Figure: uploading the nav_commands.txt file)

Extract all of the downloaded files into the config folder of the rbx1_speech package. Then rename them to something more readable:

rename -f 's/593/nav_commands/' *
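
 The rename call simply replaces the numeric ID that lmtool assigned to the download (593 here; the number in your download may differ) with nav_commands in every file name. If the perl-style rename syntax is unfamiliar, a rough Python equivalent would be:

#!/usr/bin/env python
# Rough equivalent of the rename command above: swap the lmtool ID for "nav_commands".
import glob
import os

for path in glob.glob('*593*'):
    os.rename(path, path.replace('593', 'nav_commands'))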

 Now take a look at the voice_nav_commands.launch file in the rbx1_speech/launch folder:

<launch>

   <node name="recognizer" pkg="pocketsphinx" type="recognizer.py" output="screen">
     <param name="lm" value="$(find rbx1_speech)/config/nav_commands.lm"/>
     <param name="dict" value="$(find rbx1_speech)/config/nav_commands.dic"/>
   </node>

</launch>

 

As you can see, this launch file starts the recognizer.py node with the language model and dictionary files we just generated as parameters, so recognition now uses our own vocabulary.
 Use the same commands as before to test it:

    $ roslaunch rbx1_speech voice_nav_commands.launch  
    $ rostopic echo /recognizer/output  

 

III. Voice Control

        With speech recognition in place we can build some impressive applications. Let's start by using voice commands to control a robot's motion.

1. The Robot Control Node

        As mentioned above, recognizer.py publishes the recognized text as a message, so all we need is a robot control node that subscribes to that message and reacts accordingly.
        The pocketsphinx package already ships with an example, voice_cmd_vel.py, that turns voice commands into Twist messages; the rbx1_speech package provides a simplified and modified version of it. You can find it as voice_nav.py in the nodes folder:

 

 

#!/usr/bin/env python

"""
   voice_nav.py - Version 1.1 2013-12-20

   Allows controlling a mobile base using simple speech commands.

   Based on the voice_cmd_vel.py script by Michael Ferguson in
   the pocketsphinx ROS package.

   See http://www.ros.org/wiki/pocketsphinx
"""

import rospy
from geometry_msgs.msg import Twist
from std_msgs.msg import String
from math import copysign

class VoiceNav:
    def __init__(self):
        rospy.init_node('voice_nav')

        rospy.on_shutdown(self.cleanup)

        # Set a number of parameters affecting the robot's speed
        self.max_speed = rospy.get_param("~max_speed", 0.4)
        self.max_angular_speed = rospy.get_param("~max_angular_speed", 1.5)
        self.speed = rospy.get_param("~start_speed", 0.1)
        self.angular_speed = rospy.get_param("~start_angular_speed", 0.5)
        self.linear_increment = rospy.get_param("~linear_increment", 0.05)
        self.angular_increment = rospy.get_param("~angular_increment", 0.4)

        # We don't have to run the script very fast
        self.rate = rospy.get_param("~rate", 5)
        r = rospy.Rate(self.rate)

        # A flag to determine whether or not voice control is paused
        self.paused = False

        # Initialize the Twist message we will publish.
        self.cmd_vel = Twist()

        # Publish the Twist message to the cmd_vel topic
        self.cmd_vel_pub = rospy.Publisher('cmd_vel', Twist, queue_size=5)

        # Subscribe to the /recognizer/output topic to receive voice commands.
        rospy.Subscriber('/recognizer/output', String, self.speech_callback)

        # A mapping from keywords or phrases to commands
        self.keywords_to_command = {'stop': ['stop', 'halt', 'abort', 'kill', 'panic', 'off', 'freeze', 'shut down', 'turn off', 'help', 'help me'],
                                    'slower': ['slow down', 'slower'],
                                    'faster': ['speed up', 'faster'],
                                    'forward': ['forward', 'ahead', 'straight'],
                                    'backward': ['back', 'backward', 'back up'],
                                    'rotate left': ['rotate left'],
                                    'rotate right': ['rotate right'],
                                    'turn left': ['turn left'],
                                    'turn right': ['turn right'],
                                    'quarter': ['quarter speed'],
                                    'half': ['half speed'],
                                    'full': ['full speed'],
                                    'pause': ['pause speech'],
                                    'continue': ['continue speech']}

        rospy.loginfo("Ready to receive voice commands")

        # We have to keep publishing the cmd_vel message if we want the robot to keep moving.
        while not rospy.is_shutdown():
            self.cmd_vel_pub.publish(self.cmd_vel)
            r.sleep()

    def get_command(self, data):
        # Attempt to match the recognized word or phrase to the
        # keywords_to_command dictionary and return the appropriate
        # command
        for (command, keywords) in self.keywords_to_command.iteritems():
            for word in keywords:
                if data.find(word) > -1:
                    return command

    def speech_callback(self, msg):
        # Get the motion command from the recognized phrase
        command = self.get_command(msg.data)

        # Log the command to the screen
        rospy.loginfo("Command: " + str(command))

        # If the user has asked to pause/continue voice control,
        # set the flag accordingly
        if command == 'pause':
            self.paused = True
        elif command == 'continue':
            self.paused = False

        # If voice control is paused, simply return without
        # performing any action
        if self.paused:
            return

        # The list of if-then statements should be fairly
        # self-explanatory
        if command == 'forward':
            self.cmd_vel.linear.x = self.speed
            self.cmd_vel.angular.z = 0

        elif command == 'rotate left':
            self.cmd_vel.linear.x = 0
            self.cmd_vel.angular.z = self.angular_speed

        elif command == 'rotate right':
            self.cmd_vel.linear.x = 0
            self.cmd_vel.angular.z = -self.angular_speed

        elif command == 'turn left':
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z += self.angular_increment
            else:
                self.cmd_vel.angular.z = self.angular_speed

        elif command == 'turn right':
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.angular.z -= self.angular_increment
            else:
                self.cmd_vel.angular.z = -self.angular_speed

        elif command == 'backward':
            self.cmd_vel.linear.x = -self.speed
            self.cmd_vel.angular.z = 0

        elif command == 'stop':
            # Stop the robot!  Publish a Twist message consisting of all zeros.
            self.cmd_vel = Twist()

        elif command == 'faster':
            self.speed += self.linear_increment
            self.angular_speed += self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x += copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z += copysign(self.angular_increment, self.cmd_vel.angular.z)

        elif command == 'slower':
            self.speed -= self.linear_increment
            self.angular_speed -= self.angular_increment
            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x -= copysign(self.linear_increment, self.cmd_vel.linear.x)
            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z -= copysign(self.angular_increment, self.cmd_vel.angular.z)

        elif command in ['quarter', 'half', 'full']:
            if command == 'quarter':
                self.speed = copysign(self.max_speed / 4, self.speed)

            elif command == 'half':
                self.speed = copysign(self.max_speed / 2, self.speed)

            elif command == 'full':
                self.speed = copysign(self.max_speed, self.speed)

            if self.cmd_vel.linear.x != 0:
                self.cmd_vel.linear.x = copysign(self.speed, self.cmd_vel.linear.x)

            if self.cmd_vel.angular.z != 0:
                self.cmd_vel.angular.z = copysign(self.angular_speed, self.cmd_vel.angular.z)

        else:
            return

        self.cmd_vel.linear.x = min(self.max_speed, max(-self.max_speed, self.cmd_vel.linear.x))
        self.cmd_vel.angular.z = min(self.max_angular_speed, max(-self.max_angular_speed, self.cmd_vel.angular.z))

    def cleanup(self):
        # When shutting down be sure to stop the robot!
        twist = Twist()
        self.cmd_vel_pub.publish(twist)
        rospy.sleep(1)

if __name__ == "__main__":
    try:
        VoiceNav()
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Voice navigation terminated.")

 

        As you can see, the code defines a control strategy for each command it can receive.
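
 The keyword matching in get_command() is a simple substring search, so any recognized phrase that contains one of the keywords triggers the corresponding command. A quick illustration of the same logic outside the node (the dictionary below is a shortened copy of the one in voice_nav.py):

# Shortened copy of the keyword table from voice_nav.py, matched the same way.
keywords_to_command = {'stop': ['stop', 'halt', 'freeze'],
                       'forward': ['forward', 'ahead', 'straight'],
                       'rotate left': ['rotate left']}

def get_command(data):
    # Return the first command whose keyword appears anywhere in the phrase.
    for command, keywords in keywords_to_command.items():
        for word in keywords:
            if data.find(word) > -1:
                return command

print(get_command("move forward please"))   # -> forward
print(get_command("halt right now"))        # -> stop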

2. Simulation Test

        As before, we test this in simulation with rviz.
        First bring up a fake robot model:

 

$ roslaunch rbx1_bringup fake_turtlebot.launch 

 

         Then open rviz:

$ rosrun rviz rviz -d `rospack find rbx1_nav`/sim.rviz

    Then launch the speech recognition node:

$ roslaunch rbx1_speech voice_nav_commands.launch

    And finally the robot control node:

$ roslaunch rbx1_speech turtlebot_voice_nav.launch

    OK, once this pile of nodes is up, you can start talking. The commands you can use are the phrases in nav_commands.txt (see the keyword mapping in voice_nav.py above).


 

 Maybe my English pronunciation is just too poor, but the commands are hard to get recognized.
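
 If recognition keeps failing, you can bypass the microphone entirely while debugging: voice_nav.py only looks at the /recognizer/output topic, so publishing a command string there by hand has the same effect. A minimal sketch (node name and test phrase are just examples):

#!/usr/bin/env python
# Publish a fake recognition result so voice_nav.py can be tested without a microphone.
import rospy
from std_msgs.msg import String

rospy.init_node('fake_recognizer')
pub = rospy.Publisher('/recognizer/output', String, queue_size=1)
rospy.sleep(1)                      # give subscribers time to connect
pub.publish(String("forward"))      # any phrase from nav_commands.txt will do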

 

 

IV. Playing Speech

        The robot can now act on what we say; it would be even better if it could talk back to us. ROS already includes a package for that, so let's try it.
        Run the following commands:
$ rosrun sound_play soundplay_node.py 
$ rosrun sound_play say.py "Greetings Humans. Take me to your leader."

         Did you hear it? ROS takes the text we typed and has the robot read it out loud. The default voice is called kal_diphone; if you don't like it, you can install and use a different one:

$ sudo apt-get install festvox-don
$ rosrun sound_play say.py "Welcome to the future" voice_don_diphone 

  Ha, a different voice this time. Anyway, that is not really the point here.
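
 The same speech output can also be produced from code through the sound_play Python client; a minimal sketch (the say() call is the same one the talkback.py node below relies on):

#!/usr/bin/env python
# Minimal sketch: speak a string through the sound_play server (soundplay_node.py must be running).
import rospy
from sound_play.libsoundplay import SoundClient

rospy.init_node('say_something')
soundhandle = SoundClient()
rospy.sleep(1)                      # let the client connect to the sound_play server
soundhandle.say("Welcome to the future", "voice_don_diphone")
rospy.sleep(2)                      # keep the node alive long enough for the audio to play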
       In the rbx1_speech/nodes folder there is a node called talkback.py that makes the robot speak back what it hears:

#!/usr/bin/env python

"""
    talkback.py - Version 1.1 2013-12-20

    Use the sound_play client to say back what is heard by the pocketsphinx recognizer.

    Created for the Pi Robot Project: http://www.pirobot.org
    Copyright (c) 2012 Patrick Goebel.  All rights reserved.

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details at:

    http://www.gnu.org/licenses/gpl.html
"""

import rospy
from std_msgs.msg import String
from sound_play.libsoundplay import SoundClient
import sys

class TalkBack:
    def __init__(self, script_path):
        rospy.init_node('talkback')

        rospy.on_shutdown(self.cleanup)

        # Set the default TTS voice to use
        self.voice = rospy.get_param("~voice", "voice_don_diphone")

        # Set the wave file path if used
        self.wavepath = rospy.get_param("~wavepath", script_path + "/../sounds")

        # Create the sound client object
        self.soundhandle = SoundClient()

        # Wait a moment to let the client connect to the
        # sound_play server
        rospy.sleep(1)

        # Make sure any lingering sound_play processes are stopped.
        self.soundhandle.stopAll()

        # Announce that we are ready for input
        self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")
        rospy.sleep(1)
        self.soundhandle.say("Ready", self.voice)

        rospy.loginfo("Say one of the navigation commands...")

        # Subscribe to the recognizer output and set the callback function
        rospy.Subscriber('/recognizer/output', String, self.talkback)

    def talkback(self, msg):
        # Print the recognized words on the screen
        rospy.loginfo(msg.data)

        # Speak the recognized words in the selected voice
        self.soundhandle.say(msg.data, self.voice)

        # Uncomment to play one of the built-in sounds
        #rospy.sleep(2)
        #self.soundhandle.play(5)

        # Uncomment to play a wave file
        #rospy.sleep(2)
        #self.soundhandle.playWave(self.wavepath + "/R2D2a.wav")

    def cleanup(self):
        self.soundhandle.stopAll()
        rospy.loginfo("Shutting down talkback node...")

if __name__ == "__main__":
    try:
        TalkBack(sys.path[0])
        rospy.spin()
    except rospy.ROSInterruptException:
        rospy.loginfo("Talkback node terminated.")

 

   Let's run it and see the result:

$ roslaunch rbx1_speech talkback.launch 