Recording video with ffmpeg while running speech recognition at the same time
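I record video with javacv's FrameRecorder, calling its record method to write the video frames and the audio data. We then got a new requirement: while recording, the audio also has to be fed to speech recognition in real time.
Problem 1:
Speech recognition uses the iFlytek (讯飞) SDK, which requires audio sampled at 8 kHz or 16 kHz. But after calling FrameRecorder.setSampleRate(8000), FrameRecorder.start() fails with the following error: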
avcodec_encode_audio2() error 2: Could not encode audio packet.
Problem 2:
In the official javacv recording demo, the data read from AudioRecord is a ShortBuffer, whereas the iFlytek SDK method expects a byte array. Its signature is:
public void writeAudio(byte[] data, int start, int length)
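Searching Baidu and Google turned up nothing, so I had to work it out myself.
Solution to problem 1:
The demo's default FrameRecorder.setSampleRate(44100) works without errors, so I tried leaving the recorder at 44100 and setting the sample rate to 8000 only where the audio is captured. That worked. The method that computes the timestamp has to be changed to match, though: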
public static int getTimeStampInNsFromSampleCounted(int paramInt) {
    // Divisor is samples per microsecond: 0.0441 at 44100 Hz (demo default), 0.0080 at 8000 Hz.
    // return (int) (paramInt / 0.0441D);
    return (int) (paramInt / 0.0080D);
}
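To avoid editing the constant by hand every time the capture rate changes, the same conversion can take the sample rate as a parameter. The following is only a minimal sketch and is not part of the demo: the class AudioTimestamps and the method samplesToMicros are hypothetical names. Dividing by 0.0080 is the same as multiplying by 1,000,000 / 8000, so the value is in microseconds despite the "Ns" in the original method name. Returning long is a deliberate change, since an int can only hold about 35 minutes' worth of microseconds.

```java
/** Hypothetical helper (not from the javacv demo): sample count to elapsed time. */
final class AudioTimestamps {
    private AudioTimestamps() {}

    /**
     * Returns the elapsed time in microseconds for sampleCount mono samples
     * captured at sampleRateHz.
     */
    static long samplesToMicros(long sampleCount, int sampleRateHz) {
        if (sampleRateHz <= 0) {
            throw new IllegalArgumentException("sampleRateHz must be positive");
        }
        // samples / (samples per second) = seconds; scale to microseconds first
        // so integer division does not discard the fractional second.
        return sampleCount * 1_000_000L / sampleRateHz;
    }
}
```

If this were used in updateTimestamp(), the call would look like AudioTimestamps.samplesToMicros(mCount, sampleRate), and mAudioTimestamp would need to be widened to long as well.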
Solution to problem 2:
Convert the short array to a byte array. Note that the array length becomes twice the original.
public static byte[] short2byte(short[] sData) {
    int shortArrsize = sData.length;
    byte[] bytes = new byte[shortArrsize * 2];
    for (int i = 0; i < shortArrsize; i++) {
        bytes[i * 2] = (byte) (sData[i] & 0x00FF);    // low byte first (little-endian)
        bytes[(i * 2) + 1] = (byte) (sData[i] >> 8);  // then the high byte
        sData[i] = 0;                                 // note: clears the caller's array
    }
    return bytes;
}
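The same conversion can also be written with java.nio, which makes the byte order explicit and lets you convert only the samples that were actually read. This is a sketch of an alternative, not what the demo or the iFlytek SDK prescribes: the class PcmBytes and the method toLittleEndianBytes are hypothetical names, and little-endian order is assumed only because it matches what short2byte above produces. Unlike short2byte, it leaves the input array untouched.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper (not from the javacv demo): 16-bit PCM samples to bytes. */
final class PcmBytes {
    private PcmBytes() {}

    /**
     * Copies the first sampleCount samples into a new byte array,
     * low byte first, without modifying the input.
     */
    static byte[] toLittleEndianBytes(short[] samples, int sampleCount) {
        if (sampleCount < 0 || sampleCount > samples.length) {
            throw new IllegalArgumentException("sampleCount out of range");
        }
        ByteBuffer out = ByteBuffer.allocate(sampleCount * 2)
                .order(ByteOrder.LITTLE_ENDIAN);
        // The short view inherits the buffer's byte order at creation time.
        out.asShortBuffer().put(samples, 0, sampleCount);
        return out.array();
    }
}
```

In the recording loop this would be called with the number of samples returned by AudioRecord.read, and the resulting array passed to writeAudio with start 0 and length equal to the array's length.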
/**
 * Thread that records audio
 */
class AudioRecordRunnable implements Runnable {
    short[] audioData;
    private final AudioRecord audioRecord;
    private int mCount = 0;
    int sampleRate = Constants.AUDIO_SAMPLING_RATE;

    private AudioRecordRunnable() {
        int bufferSize = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        audioRecord = new AudioRecord(MediaRecorder.AudioSource.MIC, sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT, bufferSize);
        audioData = new short[bufferSize];
    }

    /**
     * Writes audio to the recorder.
     *
     * @param buffer holds the audio data and its start position
     */
    private void record(Buffer buffer) {
        synchronized (mAudioRecordLock) {
            this.mCount += buffer.limit();
            if (!mIsPause) {
                try {
                    if (mRecorder != null) {
                        mRecorder.record(sampleRate, new Buffer[]{buffer});
                    }
                } catch (FrameRecorder.Exception e) {
                    e.printStackTrace();
                }
            }
        }
    }

    /**
     * Updates the audio timestamp
     */
    private void updateTimestamp() {
        int i = Util.getTimeStampInNsFromSampleCounted(this.mCount);
        if (mAudioTimestamp != i) {
            mAudioTimestamp = i;
            mAudioTimeRecorded = System.nanoTime();
        }
    }

    public void run() {
        android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO);
        if (audioRecord != null) {
            // Wait until the audio recorder has been initialized
            while (this.audioRecord.getState() == AudioRecord.STATE_UNINITIALIZED) {
                try {
                    Thread.sleep(100L);
                } catch (InterruptedException localInterruptedException) {
                    // ignored: keep waiting for initialization
                }
            }
            this.audioRecord.startRecording();
            while ((runAudioThread)) {
                updateTimestamp();
                int bufferReadResult = this.audioRecord.read(audioData, 0, audioData.length);
                if (bufferReadResult > 0) {
                    if (recording || (mVideoTimestamp > mAudioTimestamp)) {
                        record(ShortBuffer.wrap(audioData, 0, bufferReadResult));
                    }
                    // Feed the same samples to speech recognition as bytes
                    if (SpeechManager.getInstance().isListening()) {
                        SpeechManager.getInstance().writeAudio(Util.short2byte(audioData), 0, bufferReadResult * 2);
                    }
                }
            }
            SpeechManager.getInstance().stopListener();
            this.audioRecord.stop();
            this.audioRecord.release();
        }
    }
}
I give up on CSDN's editor: right after publishing, the post showed a pile of stray tags and the layout was broken.