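Recap: the previous article, "Android Hardware Codec MediaCodec Explained: Starting from the Story of a Pork Restaurant (Part 1)", covered MediaCodec's workflow and its lifecycle state machine. This part moves on to practice and walks through MediaCodec from the code's point of view. If you have not read Part 1, it is worth doing so first, because this article picks up right where it left off.

MediaCodec code example

The code discussed here comes from grafika, Google's official sample project for learning MediaCodec. grafika is made up of many demos, such as decoding and playing a video, recording video in real time and encoding it to H.264 for local storage, and screen recording. Each demo focuses on one particular technique.

The grafika home screen lists the demos, one entry per demo.

(Figure: grafika app home screen.)

Today we start with the first and most basic demo: decoding a local MP4 video.

(Figure: GIF of the demo playing a video.)

As the GIF shows, the demo is very simple: it decodes an MP4 video and renders the decoded data to the screen. The corresponding code is in com.android.grafika.PlayMovieActivity, and the basic flow is as follows.

(Figure: basic flow of PlayMovieActivity.)

The core decoding code all lives in MoviePlayer.

Demuxing code walkthrough

The first concept to understand is multiplexing (muxing), also called encapsulation: packing already compressed video data and audio data together according to a given format. MP4, MKV, RMVB, TS, FLV and AVI, all familiar to those of us who love watching videos, are such container formats.

FLV data, for example, typically packs an H.264 video stream and an AAC audio stream together. An FLV file consists of one FLV Header followed by a sequence of Tags, and the Tags carry the audio and video data.

(Figure: FLV structure. Source: "Introduction to Audio/Video Data Processing: Parsing the FLV Container Format".)

So before a video can be decoded, the H.264 video data must first be taken out of the container. Android provides MediaExtractor to make this demultiplexing (demuxing) easy.

Here is the MediaExtractor usage template from the official documentation:

    MediaExtractor extractor = new MediaExtractor();
    extractor.setDataSource(...);
    int numTracks = extractor.getTrackCount();
    // Walk every track (audio or video stream) in the container, read its
    // MIME type, and select the tracks we want to process.
    for (int i = 0; i < numTracks; ++i) {
        MediaFormat format = extractor.getTrackFormat(i);
        String mime = format.getString(MediaFormat.KEY_MIME);
        if (weAreInterestedInThisTrack) {
            // Select this track so that subsequent reads return its samples.
            extractor.selectTrack(i);
        }
    }
    ByteBuffer inputBuffer = ByteBuffer.allocate(...);
    // Read the samples of the selected track(s) into inputBuffer in a loop.
    while (extractor.readSampleData(inputBuffer, ...) >= 0) {
        int trackIndex = extractor.getSampleTrackIndex();
        long presentationTimeUs = extractor.getSampleTime();
        ...
        extractor.advance();
    }
    extractor.release();
    extractor = null;

The comments should make the template largely self-explanatory.

First, a word on MediaFormat. It is a class dedicated to describing a media format, and internally it does so with a set of key/value pairs. Some keys are common to all formats, some are specific to video, and some are specific to audio.

(Tables: common format keys, video-only format keys, audio-only format keys.)

The template above reads the value stored under KEY_MIME to determine the media type of each track. Common video MIME types include:

    "video/x-vnd.on2.vp8" - VP8 video (i.e. video in .webm)
    "video/x-vnd.on2.vp9" - VP9 video (i.e. video in .webm)
    "video/avc" - H.264/AVC video
    "video/hevc" - H.265/HEVC video
    "video/mp4v-es" - MPEG4 video
    "video/3gpp" - H.263 video

Since this series mainly deals with H.264, the MIME type we care about is "video/avc".

In grafika, the MoviePlayer constructor (com.android.grafika.MoviePlayer#MoviePlayer) uses MediaExtractor to obtain the video's width and height:

    // Demux
    MediaExtractor extractor = null;
    try {
        extractor = new MediaExtractor();
        // Pass in the path of the video file.
        extractor.setDataSource(sourceFile.toString());
        int trackIndex = selectTrack(extractor);
        if (trackIndex < 0) {
            throw new RuntimeException("No video track found in " + mSourceFile);
        }
        // Select the track we found (the video track); everything below operates on it.
        extractor.selectTrack(trackIndex);
        // Read the video's width and height from the track's MediaFormat.
        MediaFormat format = extractor.getTrackFormat(trackIndex);
        Log.d(TAG, "extractor.getTrackFormat format" + format);
        // Width and height of the video.
        mVideoWidth = format.getInteger(MediaFormat.KEY_WIDTH);
        mVideoHeight = format.getInteger(MediaFormat.KEY_HEIGHT);
        if (VERBOSE) {
            Log.d(TAG, "Video size is " + mVideoWidth + "x" + mVideoHeight);
        }
    } finally {
        if (extractor != null) {
            extractor.release();
        }
    }

The actual playback method, com.android.grafika.MoviePlayer#play, uses the MIME type obtained from the track to create a MediaCodec decoder:

    MediaFormat format = extractor.getTrackFormat(trackIndex);
    Log.d(TAG, "EgetTrackFormat format:" + format);
    // Create a MediaCodec decoder, and configure it with the MediaFormat from the
    // extractor.  It's very important to use the format from the extractor because
    // it contains a copy of the CSD-0/CSD-1 codec-specific data chunks.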
    String mime = format.getString(MediaFormat.KEY_MIME);
    Log.d(TAG, "createDecoderByType mime:" + mime);
    // Create the decoder from the video MIME type.
    MediaCodec decoder = MediaCodec.createDecoderByType(mime);

At this point MediaCodec is in the Uninitialized sub-state of the Stopped state. Next we start it up (the boss tidies the kitchen, tables and chairs, and gets ready to open the restaurant):

    // Configure the decoder with the MediaFormat and the output Surface;
    // the decoder enters the Configured state.
    decoder.configure(format, mOutputSurface, null, 0);
    // Start the decoder; it enters the Executing state.
    // Immediately after start() the codec is in the Flushed sub-state, where it holds all the buffers.
    decoder.start();
    // The actual decode loop.
    doExtract(extractor, trackIndex, decoder, mFrameCallback);

Note that configure is passed a Surface object, mOutputSurface. As Part 1 explained for raw video data, video codecs support three kinds of color formats, and one of them is the native raw video format, COLOR_FormatSurface, which is used for Surface-based input and output. This Surface object is created from the Activity's TextureView:

    // MoviePlayer renders the decoded raw video to the TextureView through this Surface.
    SurfaceTexture st = mTextureView.getSurfaceTexture();
    Surface surface = new Surface(st);
    MoviePlayer player = null;
    try {
        player = new MoviePlayer(
                new File(getFilesDir(), mMovieFiles[mSelectedMovie]), surface, callback);
    } catch (IOException ioe) {
        Log.e(TAG, "Unable to play movie", ioe);
        surface.release();
        return;
    }

Decoding code walkthrough

MediaCodec is now started and has entered the big loop over its input side and output side (picture the purchaser putting raw pork into baskets and handing them to the chef again and again, and the chef cooking it and serving it to customers on plates). The key code is in com.android.grafika.MoviePlayer#doExtract:

    /**
     * Work loop.  We execute here until we run out of video or are told to stop.
     */
    private void doExtract(MediaExtractor extractor, int trackIndex, MediaCodec decoder,
            FrameCallback frameCallback) {
        // We need to strike a balance between providing input and reading output that
        // operates efficiently without delays on the output side.
        //
        // To avoid delays on the output side, we need to keep the codec's input buffers
        // fed.  There can be significant latency between submitting frame N to the decoder
        // and receiving frame N on the output, so we need to stay ahead of the game.
        //
        // Many video decoders seem to want several frames of video before they start
        // producing output -- one implementation wanted four before it appeared to
        // configure itself.  We need to provide a bunch of input frames up front, and try
        // to keep the queue full as we go.
        //
        // (Note it's possible for the encoded data to be written to the stream out of order,
        // so we can't generally submit a single frame and wait for it to appear.)
        //
        // We can't just fixate on the input side though.  If we spend too much time trying
        // to stuff the input, we might miss a presentation deadline.  At 60Hz we have 16.7ms
        // between frames, so sleeping for 10ms would eat up a significant fraction of the
        // time allowed.  (Most video is at 30Hz or less, so for most content we'll have
        // significantly longer.)  Waiting for output is okay, but sleeping on availability
        // of input buffers is unwise if we need to be providing output on a regular schedule.
        //
        //
        // In some situations, startup latency may be a concern.  To minimize startup time,
        // we'd want to stuff the input full as quickly as possible.  This turns out to be
        // somewhat complicated, as the codec may still be starting up and will refuse to
        // accept input.  Removing the timeout from dequeueInputBuffer() results in spinning
        // on the CPU.
        //
        // If you have tight startup latency requirements, it would probably be best to
        // "prime the pump" with a sequence of frames that aren't actually shown (e.g.
        // grab the first 10 NAL units and shove them through, then rewind to the start of
        // the first key frame).
        //
        // The actual latency seems to depend strongly on the nature of the video (e.g.
        // resolution).
        //
        //
        // One conceptually nice approach is to loop on the input side to ensure that the codec
        // always has all the input it can handle.  After submitting a buffer, we immediately
        // check to see if it will accept another.  We can use a short timeout so we don't
        // miss a presentation deadline.
        // On the output side we only check once, with a longer
        // timeout, then return to the outer loop to see if the codec is hungry for more input.
        //
        // In practice, every call to check for available buffers involves a lot of message-
        // passing between threads and processes.  Setting a very brief timeout doesn't
        // exactly work because the overhead required to determine that no buffer is available
        // is substantial.  On one device, the "clever" approach caused significantly greater
        // and more highly variable startup latency.
        //
        // The code below takes a very simple-minded approach that works, but carries a risk
        // of occasionally running out of output.  A more sophisticated approach might
        // detect an output timeout and use that as a signal to try to enqueue several input
        // buffers on the next iteration.
        //
        // If you want to experiment, set the VERBOSE flag to true and watch the behavior
        // in logcat.  Use "logcat -v threadtime" to see sub-second timing.

        // Timeout, in microseconds, used when dequeuing both input and output buffers.
        final int TIMEOUT_USEC = 0;
        // Array of input ByteBuffers (newer MediaCodec versions replace this with
        // getInputBuffer, which returns a single buffer directly).
        ByteBuffer[] decoderInputBuffers = decoder.getInputBuffers();
        // Counts how many chunks of data have been submitted.
        int inputChunk = 0;
        // Time the first input buffer was submitted; used to log the startup lag.
        long firstInputTimeNsec = -1;

        boolean outputDone = false;
        boolean inputDone = false;
        while (!outputDone) {
            if (VERBOSE) Log.d(TAG, "loop");
            if (mIsStopRequested) {
                Log.d(TAG, "Stop requested");
                return;
            }

            // Feed more data to the decoder.
            if (!inputDone) {
                // Get the index of an available input ByteBuffer.
                int inputBufIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);
                if (inputBufIndex >= 0) {
                    if (firstInputTimeNsec == -1) {
                        firstInputTimeNsec = System.nanoTime();
                    }
                    // Look up the input ByteBuffer for that index.
                    ByteBuffer inputBuf = decoderInputBuffers[inputBufIndex];
                    Log.d(TAG, "decoderInputBuffers inputBuf:" + inputBuf
                            + ",inputBufIndex:" + inputBufIndex);
                    // Read the sample data into the ByteBuffer.  This neither respects nor
                    // updates inputBuf's position, limit, etc.
                    // chunkSize is the size of the one sample read from the media file.
                    int chunkSize = extractor.readSampleData(inputBuf, 0);
                    if (chunkSize < 0) {
                        // Reached the end of the file: send an empty frame with the EOS flag
                        // set so the decoder knows exactly where the stream ends.
                        // End of stream -- send empty frame with EOS flag set.
                        // When you queue an input buffer with the end-of-stream marker, the codec transitions
                        // to the End-of-Stream sub-state. In this state the codec no longer accepts further
                        // input buffers, but still generates output buffers until the end-of-stream is reached
                        // on the output.
                        decoder.queueInputBuffer(inputBufIndex, 0, 0, 0L,
                                MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                        Log.d(TAG, "queueInputBuffer");
                        inputDone = true;
                        if (VERBOSE) Log.d(TAG, "sent input EOS");
                    } else {
                        if (extractor.getSampleTrackIndex() != trackIndex) {
                            Log.w(TAG, "WEIRD: got sample from track "
                                    + extractor.getSampleTrackIndex() + ", expected " + trackIndex);
                        }
                        // Presentation time of the current sample.
                        long presentationTimeUs = extractor.getSampleTime();
                        // Submit the data in buffer inputBufIndex to MediaCodec.
                        decoder.queueInputBuffer(inputBufIndex, 0, chunkSize,
                                presentationTimeUs, 0 /*flags*/);
                        Log.d(TAG, "queueInputBuffer inputBufIndex:" + inputBufIndex);
                        if (VERBOSE) {
                            Log.d(TAG, "submitted frame " + inputChunk + " to dec, size="
                                    + chunkSize);
                        }
                        // Count the chunk just submitted.
                        inputChunk++;
                        // Move the extractor's read cursor to the next sample.
                        extractor.advance();
                    }
                } else {
                    if (VERBOSE) Log.d(TAG, "input buffer not available");
                }
            }

            if (!outputDone) {
                // On success the return value is the index of the decoded buffer among the
                // output buffers, and information about that buffer is stored in mBufferInfo.
                // Otherwise the return value is a status code.
                int outputBufferIndex = decoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
                Log.d(TAG, "dequeueOutputBuffer decoderBufferIndex:" + outputBufferIndex
                        + ",mBufferInfo:" + mBufferInfo);
                if (outputBufferIndex == MediaCodec.INFO_TRY_AGAIN_LATER) {
                    // no output available yet
                    if (VERBOSE) Log.d(TAG, "no output from decoder available");
                } else if (outputBufferIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                    // not important for us, since we're using Surface
                    if (VERBOSE) Log.d(TAG, "decoder output buffers changed");
                } else if (outputBufferIndex ==
                        MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                    MediaFormat newFormat = decoder.getOutputFormat();
                    if (VERBOSE) Log.d(TAG, "decoder output format changed: " + newFormat);
                } else if (outputBufferIndex < 0) {
                    throw new RuntimeException(
                            "unexpected result from decoder.dequeueOutputBuffer: "
                                    + outputBufferIndex);
                } else { // outputBufferIndex >= 0
                    if (firstInputTimeNsec != 0) {
                        // Log the delay from the first buffer of input to the first buffer
                        // of output.
                        long nowNsec = System.nanoTime();
                        Log.d(TAG, "startup lag "
                                + ((nowNsec - firstInputTimeNsec) / 1000000.0) + " ms");
                        firstInputTimeNsec = 0;
                    }
                    boolean doLoop = false;
                    if (VERBOSE) Log.d(TAG, "surface decoder given buffer " + outputBufferIndex
                            + " (output mBufferInfo size=" + mBufferInfo.size + ")");
                    // Check for end of stream: the BUFFER_FLAG_END_OF_STREAM flag set on the
                    // input side above shows up here on the output side.
                    if ((mBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                        if (VERBOSE) Log.d(TAG, "output EOS");
                        if (mLoop) {
                            doLoop = true;
                        } else {
                            outputDone = true;
                        }
                    }

                    // Render only if the decoded buffer is not empty.
                    boolean doRender = (mBufferInfo.size != 0);

                    // As soon as we call releaseOutputBuffer, the buffer will be forwarded
                    // to SurfaceTexture to convert to a texture.  We can't control when it
                    // appears on-screen, but we can manage the pace at which we release
                    // the buffers.
                    if (doRender && frameCallback != null) {
                        // Pre-render callback; the implementation here sleeps for a computed
                        // duration to keep the frame rate as steady as possible.
                        frameCallback.preRender(mBufferInfo.presentationTimeUs);
                    }
                    // Get the array of output buffers (newer versions replace this with
                    // getOutputBuffer).
                    ByteBuffer[] decoderOutputBuffers = decoder.getOutputBuffers();
                    Log.d(TAG, "ecoderOutputBuffers.length:" + decoderOutputBuffers.length);
                    // Return output buffer outputBufferIndex to the codec; when doRender is
                    // true, it is rendered to the configured Surface.
                    decoder.releaseOutputBuffer(outputBufferIndex, doRender);
                    if (doRender && frameCallback != null) {
                        // Post-render callback.
                        frameCallback.postRender();
                    }

                    if (doLoop) {
                        Log.d(TAG, "Reached EOS, looping");
                        // When looping, move the extractor's cursor back to the start.
                        extractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC);
                        inputDone = false;
                        // Put the decoder back into the Flushed sub-state, otherwise a new
                        // round of playback cannot start.
                        // You can move back to the Flushed sub-state at any time while
                        // in the Executing state using flush().
                        decoder.flush();    // reset decoder state
                        frameCallback.loopReset();
                    }
                }
            }
        }
    }

The code carries both the official comments and detailed comments of my own, so here I will only pick out a few key points.

1. The purchaser asks the chef whether a basket is free: first ask MediaCodec whether an input buffer is currently available:

    int inputBufIndex = decoder.dequeueInputBuffer(TIMEOUT_USEC);

The method is defined as:

    /**
     * Returns the index of an input buffer to be filled with valid data
     * or -1 if no such buffer is currently available.
     * This method will return immediately if timeoutUs == 0, wait indefinitely
     * for the availability of an input buffer if timeoutUs < 0 or wait up
     * to "timeoutUs" microseconds if timeoutUs > 0.
     * @param timeoutUs The timeout in microseconds, a negative timeout indicates "infinite".
     * @throws IllegalStateException if not in the Executing state,
     *         or codec is configured in asynchronous mode.
     * @throws MediaCodec.CodecException upon codec error.
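     */
    public final int dequeueInputBuffer(long timeoutUs) {
        int res = native_dequeueInputBuffer(timeoutUs);
        if (res >= 0) {
            synchronized(mBufferLock) {
                validateInputByteBuffer(mCachedInputBuffers, res);
            }
        }
        return res;
    }

TIMEOUT_USEC is the wait timeout. When the returned inputBufIndex is greater than or equal to 0, a buffer is available, and inputBufIndex is that buffer's index within MediaCodec. If no buffer becomes available within TIMEOUT_USEC, the returned inputBufIndex is negative, and the next loop iteration tries again.

2. The purchaser puts raw pork into the basket and hands it to the chef: on each iteration, MediaExtractor's readSampleData reads one chunk of video data into the ByteBuffer, and MediaCodec's queueInputBuffer then hands that ByteBuffer to MediaCodec for processing.

    // Read one sample from the media file into inputBuf; chunkSize is its size.
    int chunkSize = extractor.readSampleData(inputBuf, 0);

The method is defined as:

    /**
     * Retrieve the current encoded sample and store it in the byte buffer
     * starting at the given offset.
     *
     * Note: As of API 21, on success the position and limit of
     * {@code byteBuf} is updated to point to the data just read.
     * @param byteBuf the destination byte buffer
     * @return the sample size (or -1 if no more samples are available).
     */
    public native int readSampleData(@NonNull ByteBuffer byteBuf, int offset);

Part 1 mentioned that, according to the official documentation, video data passed to MediaCodec should generally consist of complete frames only, unless the data is marked with BUFFER_FLAG_PARTIAL_FRAME. From this we can infer that readSampleData reads one frame of data at a time; I will verify that later.

The return value is the size of the data that was read, so a non-negative return value means data was read, and that data is submitted to MediaCodec:

    // Presentation time of the current sample.
    long presentationTimeUs = extractor.getSampleTime();
    // Submit the data in buffer inputBufIndex to MediaCodec.
    decoder.queueInputBuffer(inputBufIndex, 0, chunkSize, presentationTimeUs, 0 /*flags*/);

The Javadoc for queueInputBuffer is far too long to quote. Put simply, this call submits chunkSize bytes, starting at offset 0, of input buffer number inputBufIndex to MediaCodec, and sets the presentation time of that frame to presentationTimeUs. As explained in "Understanding H.264 Video Coding: Starting from a Son Ye-jin Film (Part 1)", the introduction of B-frames means that the order in which frames are encoded can differ from the order in which they are displayed, which is why two timestamps exist: PTS (presentation time) and DTS (decoding time). The presentationTimeUs here is the PTS. Samples are queued in decoding order, which may differ from display order, so presentationTimeUs is needed to establish the playback order. The last parameter, flags, describes the submitted data and is generally used for special cases; passing 0 is enough here.

If readSampleData returns a negative value, the end of the video file has been reached. queueInputBuffer is still called, but with special arguments:

    decoder.queueInputBuffer(inputBufIndex, 0, 0, 0L, MediaCodec.BUFFER_FLAG_END_OF_STREAM);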
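The two input-side branches above can be summarised in a small plain-Java sketch. It does not touch MediaCodec or MediaExtractor at all: `InputSideSketch`, `QueueCall` and `planQueueCall` are hypothetical names introduced only for illustration, and the constant 4 mirrors the value of `MediaCodec.BUFFER_FLAG_END_OF_STREAM` in the Android SDK. It only shows which arguments would end up in queueInputBuffer for a given readSampleData result.

```java
public class InputSideSketch {
    // Mirrors MediaCodec.BUFFER_FLAG_END_OF_STREAM (value 4 in the Android SDK).
    static final int BUFFER_FLAG_END_OF_STREAM = 4;

    /** Hypothetical holder for the arguments that would go to queueInputBuffer. */
    static final class QueueCall {
        final int size;                 // number of valid bytes in the input buffer
        final long presentationTimeUs;  // PTS of the sample, in microseconds
        final int flags;                // 0 for a normal frame, EOS flag at end of file
        final boolean inputDone;        // true once no more input may be queued

        QueueCall(int size, long presentationTimeUs, int flags, boolean inputDone) {
            this.size = size;
            this.presentationTimeUs = presentationTimeUs;
            this.flags = flags;
            this.inputDone = inputDone;
        }
    }

    /**
     * Decide what to queue, given the result of readSampleData (chunkSize)
     * and getSampleTime (sampleTimeUs). A negative chunkSize means end of file.
     */
    static QueueCall planQueueCall(int chunkSize, long sampleTimeUs) {
        if (chunkSize < 0) {
            // End of file: an empty buffer carrying only the end-of-stream flag.
            return new QueueCall(0, 0L, BUFFER_FLAG_END_OF_STREAM, true);
        }
        // Normal sample: pass its size and presentation time, no flags.
        return new QueueCall(chunkSize, sampleTimeUs, 0, false);
    }

    public static void main(String[] args) {
        QueueCall frame = planQueueCall(1024, 33_333L);
        QueueCall eos = planQueueCall(-1, -1L);
        System.out.println("frame: size=" + frame.size
                + " pts=" + frame.presentationTimeUs + " flags=" + frame.flags);
        System.out.println("eos: size=" + eos.size
                + " flags=" + eos.flags + " inputDone=" + eos.inputDone);
    }
}
```

Keeping the decision in a pure function like this is only a teaching device; in doExtract the same branch is written inline around the real queueInputBuffer calls.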
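The call with BUFFER_FLAG_END_OF_STREAM sends an empty frame that tells MediaCodec the end of the file has been reached and there is no more data to submit. In other words, the purchaser tells the chef that there is no raw pork left. Once this end-of-stream frame has been sent, no more data may be passed to the input side until MediaCodec has been flushed, or stopped and then started again.

That is all for the input side. Without pausing, the loop goes straight to the output side and tries to fetch an output buffer (the customer walks up to the chef and asks whether the pork is ready):

    int outputBufferIndex = decoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);

If this does not succeed (the chef tells the customer the pork is not ready yet), the return value is a status code. The project code handles the following common ones:

1. MediaCodec.INFO_TRY_AGAIN_LATER: no decoded data became available within TIMEOUT_USEC. Typically either the wait was simply too short, or the input was a B-frame, which needs a later P-frame as a reference before it can be decoded (for B-frames and P-frames, see "Understanding H.264 Video Coding: Starting from a Son Ye-jin Film (Part 1)").
2. MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED: the array of output buffers is stale and must be fetched again. Because newer versions obtain output buffers with getOutputBuffer, this status code is deprecated as well.
3. MediaCodec.INFO_OUTPUT_FORMAT_CHANGED: the MediaFormat of the output data has changed.

If decoding succeeded, the return value is the index of the decoded buffer among the output buffers, and information about that buffer is stored in mBufferInfo. Then comes a crucial line of code:

    decoder.releaseOutputBuffer(outputBufferIndex, doRender);

This renders output buffer number outputBufferIndex to the Surface (remember the Surface object passed to configure?). When doRender is true, the buffer is drawn to the configured Surface. You can think of this line as something like Canvas's draw method on Android: each call draws one frame, and it also returns the buffer to the codec.

Summary

Good times always pass too quickly. I think the key decoding code has now been covered in reasonable detail. To keep the article from growing so long that readers nod off, I will stop here. The next post, "Android Hardware Codec MediaCodec Explained: Starting from the Story of a Pork Restaurant (Part 3)", will cover the key observations and details to watch for when actually running this code. Stay tuned.

References:
Introduction to Audio/Video Data Processing: Parsing the FLV Container Format
MediaCodec official documentation
An Analysis of the Android Decoder MediaCodec

Author: 半岛铁盒里的猫
Link: https://juejin.cn/post/7111340889691127815/
Source: 稀土掘金 (Juejin)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, credit the source.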