Supporting DXVA 2.0 in DirectShow

Date: 2023-03-08 21:50:07

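  I have been working on DXVA2 hardware acceleration for the past few days and could not find much material, so I translated two related Microsoft documents. I also plan to write up how to implement DXVA2 with ffmpeg, which will be the third post. This is the second post. English original: https://msdn.microsoft.com/en-us/library/aa965245(v=vs.85).aspx
The first post translates the Direct3D device manager topic: http://www.cnblogs.com/betterwgo/p/6124588.html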

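  This topic describes how to support DirectX Video Acceleration (DXVA) 2.0 in a DirectShow decoder filter. Specifically, it describes the communication between the decoder and the video renderer. It does not describe how to implement DXVA decoding.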

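1. Prerequisites

This topic assumes that you are familiar with writing DirectShow filters. For more information, see the Writing DirectShow Filters topic in the DirectShow SDK documentation (https://msdn.microsoft.com/en-us/library/dd391013(v=vs.85).aspx). The code examples assume that the decoder filter derives from the CTransformFilter class, with the following class definition: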

class CDecoder : public CTransformFilter
{
public:
    static CUnknown* WINAPI CreateInstance(IUnknown *pUnk, HRESULT *pHr);

    HRESULT CompleteConnect(PIN_DIRECTION direction, IPin *pPin);

    HRESULT InitAllocator(IMemAllocator **ppAlloc);
    HRESULT DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *pProp);

    // TODO: The implementations of these methods depend on the specific decoder.
    HRESULT CheckInputType(const CMediaType *mtIn);
    HRESULT CheckTransform(const CMediaType *mtIn, const CMediaType *mtOut);
    // Declared without the CTransformFilter:: qualifier, which is not
    // valid inside a class definition in standard C++.
    HRESULT GetMediaType(int iPosition, CMediaType *pMediaType);

private:
    CDecoder(HRESULT *pHr);
    ~CDecoder();

    CBasePin * GetPin(int n);

    HRESULT ConfigureDXVA2(IPin *pPin);
    HRESULT SetEVRForDXVA2(IPin *pPin);

    HRESULT FindDecoderConfiguration(
        /* [in] */  IDirectXVideoDecoderService *pDecoderService,
        /* [in] */  const GUID& guidDecoder,
        /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
        /* [out] */ BOOL *pbFoundDXVA2Configuration
        );

private:
    IDirectXVideoDecoderService *m_pDecoderService;

    DXVA2_ConfigPictureDecode m_DecoderConfig;
    GUID m_DecoderGuid;
    HANDLE m_hDevice;

    FOURCC m_fccOutputFormat;
};

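In the rest of this topic, the term decoder refers to the decoder filter, which receives compressed video and outputs uncompressed video. The term decoder device refers to a hardware video accelerator implemented by the graphics driver.

To support DXVA 2.0, a decoder must perform the following basic steps:

(1) Negotiate a media type. (Translator's note: as I understand it, this means mapping the format detected for the source stream, for example by ffmpeg, to the corresponding DXVA2 format.)

(2) Find a DXVA decoder configuration.

(3) Notify the video renderer that the decoder is using DXVA decoding.

(4) Provide a custom allocator that allocates Direct3D surfaces.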

2. Migration Notes

If you are migrating from DXVA 1.0 to DXVA 2.0, be aware of the following significant differences between the two versions:

(1) DXVA 2.0 does not use the IAMVideoAccelerator and IAMVideoAcceleratorNotify interfaces, because the decoder can access the DXVA 2.0 APIs directly through the IDirectXVideoDecoder interface.

(2) During media type negotiation, the decoder does not use a video acceleration GUID as the subtype. Instead, the subtype is simply the uncompressed video format (such as NV12), as with software decoding.

(3) The procedure for configuring the accelerator has changed. In DXVA 1.0, the decoder calls Execute with a DXVA_ConfigPictureDecode structure to configure the accelerator. In DXVA 2.0, the decoder uses the IDirectXVideoDecoderService interface, as described in the next section.

(4) The decoder allocates the uncompressed buffers. The video renderer no longer allocates them.

(5) Instead of calling IAMVideoAccelerator::DisplayFrame to display a decoded frame, the decoder delivers the frame to the renderer by calling IMemInputPin::Receive, as with software decoding.

(6) The decoder is no longer responsible for checking when data buffers are safe for updates. Therefore DXVA 2.0 does not have any method equivalent to IAMVideoAccelerator::QueryRenderStatus.

(7) Subpicture blending is done by the video renderer, using the DXVA 2.0 video processor APIs. Decoders that provide subpictures (for example, DVD decoders) should send subpicture data on a separate output pin.

For decoding operations, DXVA 2.0 uses the same data structures as DXVA 1.0. (Translator's note: "data structures" here presumably means the structs that carry the decoding data.)

The enhanced video renderer (EVR) filter supports DXVA 2.0. The Video Mixing Renderer filters (VMR-7 and VMR-9) support DXVA 1.0 only.

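3. Finding a Decoder Configuration

After the decoder negotiates the output media type, it must find a compatible configuration for the DXVA decoder device. You can perform this step inside the output pin's CBaseOutputPin::CompleteConnect method. This step ensures that the graphics driver supports the capabilities needed by the decoder, before the decoder commits to using DXVA.

To find a configuration for the decoder device, do the following:

1) Query the renderer's input pin for the IMFGetService interface.

2) Call IMFGetService::GetService to get a pointer to the IDirect3DDeviceManager9 interface. The service GUID is MR_VIDEO_ACCELERATION_SERVICE.

3) Call IDirect3DDeviceManager9::OpenDeviceHandle to get a handle to the renderer's Direct3D device.

4) Call IDirect3DDeviceManager9::GetVideoService and pass in the device handle. This method returns a pointer to the IDirectXVideoDecoderService interface.

5) Call IDirectXVideoDecoderService::GetDecoderDeviceGuids. This method returns an array of decoder device GUIDs.

6) Loop through the array of decoder GUIDs to find the ones that the decoder filter supports. For example, for an MPEG-2 decoder you would look for DXVA2_ModeMPEG2_MOCOMP, DXVA2_ModeMPEG2_IDCT, or DXVA2_ModeMPEG2_VLD.

7) When you find a candidate decoder device GUID, pass the GUID to the IDirectXVideoDecoderService::GetDecoderRenderTargets method. This method returns an array of render target formats, specified as D3DFORMAT values.

8) Loop through the render target formats and look for one that matches your output format. Typically, a decoder device supports a single render target format. The decoder filter should connect to the renderer using this subtype. In the first call to CompleteConnect (the base-class method invoked when a pin connection completes), the decoder can determine the render target format and then return this format as a preferred output type.

9) Call IDirectXVideoDecoderService::GetDecoderConfigurations. Pass in the same decoder device GUID, along with a DXVA2_VideoDesc structure that describes the proposed format. The method returns an array of DXVA2_ConfigPictureDecode structures. Each structure describes one possible configuration for the decoder device.

10) Assuming that the previous steps succeed, store the Direct3D device handle, the decoder device GUID, and the configuration structure. The filter will use this information to create the decoder device.

The following code shows how to find a decoder configuration: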

HRESULT CDecoder::ConfigureDXVA2(IPin *pPin)
{
    UINT cDecoderGuids = 0;
    BOOL bFoundDXVA2Configuration = FALSE;
    GUID guidDecoder = GUID_NULL;

    DXVA2_ConfigPictureDecode config;
    ZeroMemory(&config, sizeof(config));

    // Variables that follow must be cleaned up at the end.
    IMFGetService *pGetService = NULL;
    IDirect3DDeviceManager9 *pDeviceManager = NULL;
    IDirectXVideoDecoderService *pDecoderService = NULL;

    GUID *pDecoderGuids = NULL; // size = cDecoderGuids
    HANDLE hDevice = INVALID_HANDLE_VALUE;

    // Query the pin for IMFGetService.
    HRESULT hr = pPin->QueryInterface(IID_PPV_ARGS(&pGetService));

    // Get the Direct3D device manager.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE,
            IID_PPV_ARGS(&pDeviceManager)
            );
    }

    // Open a new device handle.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->OpenDeviceHandle(&hDevice);
    }

    // Get the video decoder service.
    if (SUCCEEDED(hr))
    {
        hr = pDeviceManager->GetVideoService(
            hDevice, IID_PPV_ARGS(&pDecoderService));
    }

    // Get the decoder GUIDs.
    if (SUCCEEDED(hr))
    {
        hr = pDecoderService->GetDecoderDeviceGuids(
            &cDecoderGuids, &pDecoderGuids);
    }

    if (SUCCEEDED(hr))
    {
        // Look for the decoder GUIDs we want.
        for (UINT iGuid = 0; iGuid < cDecoderGuids; iGuid++)
        {
            // Do we support this mode?
            if (!IsSupportedDecoderMode(pDecoderGuids[iGuid]))
            {
                continue;
            }

            // Find a configuration that we support.
            hr = FindDecoderConfiguration(pDecoderService, pDecoderGuids[iGuid],
                &config, &bFoundDXVA2Configuration);
            if (FAILED(hr))
            {
                break;
            }

            if (bFoundDXVA2Configuration)
            {
                // Found a good configuration. Save the GUID and exit the loop.
                guidDecoder = pDecoderGuids[iGuid];
                break;
            }
        }
    }

    // Keep an earlier, more specific error code if there is one.
    if (SUCCEEDED(hr) && !bFoundDXVA2Configuration)
    {
        hr = E_FAIL; // Unable to find a configuration.
    }

    if (SUCCEEDED(hr))
    {
        // Store the things we will need later.
        SafeRelease(&m_pDecoderService);
        m_pDecoderService = pDecoderService;
        m_pDecoderService->AddRef();

        m_DecoderConfig = config;
        m_DecoderGuid = guidDecoder;
        m_hDevice = hDevice;
    }

    if (FAILED(hr))
    {
        if (hDevice != INVALID_HANDLE_VALUE)
        {
            pDeviceManager->CloseDeviceHandle(hDevice);
        }
    }

    // The GUID array returned by GetDecoderDeviceGuids must be freed
    // by the caller.
    CoTaskMemFree(pDecoderGuids);

    SafeRelease(&pGetService);
    SafeRelease(&pDeviceManager);
    SafeRelease(&pDecoderService);
    return hr;
}

HRESULT CDecoder::FindDecoderConfiguration(
    /* [in] */  IDirectXVideoDecoderService *pDecoderService,
    /* [in] */  const GUID& guidDecoder,
    /* [out] */ DXVA2_ConfigPictureDecode *pSelectedConfig,
    /* [out] */ BOOL *pbFoundDXVA2Configuration
    )
{
    HRESULT hr = S_OK;
    UINT cFormats = 0;
    UINT cConfigurations = 0;

    D3DFORMAT *pFormats = NULL; // size = cFormats
    DXVA2_ConfigPictureDecode *pConfig = NULL; // size = cConfigurations

    // Find the valid render target formats for this decoder GUID.
    hr = pDecoderService->GetDecoderRenderTargets(
        guidDecoder,
        &cFormats,
        &pFormats
        );

    if (SUCCEEDED(hr))
    {
        // Look for a format that matches our output format.
        for (UINT iFormat = 0; iFormat < cFormats; iFormat++)
        {
            if (pFormats[iFormat] != (D3DFORMAT)m_fccOutputFormat)
            {
                continue;
            }

            // Fill in the video description. Set the width, height, format,
            // and frame rate.
            DXVA2_VideoDesc videoDesc = {};

            FillInVideoDescription(&videoDesc); // Private helper function.
            videoDesc.Format = pFormats[iFormat];

            // Get the available configurations.
            hr = pDecoderService->GetDecoderConfigurations(
                guidDecoder,
                &videoDesc,
                NULL, // Reserved.
                &cConfigurations,
                &pConfig
                );

            if (FAILED(hr))
            {
                break;
            }

            // Find a supported configuration.
            for (UINT iConfig = 0; iConfig < cConfigurations; iConfig++)
            {
                if (IsSupportedDecoderConfig(pConfig[iConfig]))
                {
                    // This configuration is good.
                    *pbFoundDXVA2Configuration = TRUE;
                    *pSelectedConfig = pConfig[iConfig];
                    break;
                }
            }

            CoTaskMemFree(pConfig);
            break;

        } // End of formats loop.
    }

    CoTaskMemFree(pFormats);

    // Note: It is possible to return S_OK without finding a configuration.
    return hr;
}

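Because this example is generic, some of the logic has been placed in helper functions that the decoder must implement. The following code shows the declarations of these functions: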

// Returns TRUE if the decoder supports a given decoding mode.
BOOL IsSupportedDecoderMode(const GUID& mode);

// Returns TRUE if the decoder supports a given decoding configuration.
BOOL IsSupportedDecoderConfig(const DXVA2_ConfigPictureDecode& config);

// Fills in a DXVA2_VideoDesc structure based on the input format.
void FillInVideoDescription(DXVA2_VideoDesc *pDesc);
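
The bodies of these helpers are left to the decoder, so they depend on the codec. As a rough illustration only, the sketch below shows one way the configuration check could look for a decoder that sends raw bitstream data. ConfigStandIn is a hypothetical stand-in that carries just the ConfigBitstreamRaw member of DXVA2_ConfigPictureDecode, so the example builds without the Windows SDK. The acceptance rule (ConfigBitstreamRaw equal to 1) is an assumed policy, and FindSupportedConfig is an illustrative name that does not appear in the original sample.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the single DXVA2_ConfigPictureDecode member
// that this sketch inspects. The real structure has many more members.
struct ConfigStandIn
{
    unsigned ConfigBitstreamRaw;
};

// Assumed policy: accept only configurations that take raw bitstream data.
bool IsSupportedDecoderConfig(const ConfigStandIn& config)
{
    return config.ConfigBitstreamRaw == 1;
}

// Returns the index of the first supported configuration, or -1 if the
// array holds none. Mirrors the inner loop of FindDecoderConfiguration.
int FindSupportedConfig(const ConfigStandIn* configs, std::size_t count)
{
    assert(configs != nullptr || count == 0);

    for (std::size_t i = 0; i < count; ++i)
    {
        if (IsSupportedDecoderConfig(configs[i]))
        {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In a real filter the same check would take a const DXVA2_ConfigPictureDecode& and be called from the configuration loop in FindDecoderConfiguration.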

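4. Notifying the Video Renderer

If the decoder finds a decoder configuration, the next step is to notify the video renderer that the decoder will use hardware acceleration. You can perform this step inside the CompleteConnect method. This step must occur before the allocator is selected, because it affects how the allocator is selected.

1) Query the renderer's input pin for the IMFGetService interface.

2) Call IMFGetService::GetService to get a pointer to the IDirectXVideoMemoryConfiguration interface. The service GUID is MR_VIDEO_ACCELERATION_SERVICE.

3) Call IDirectXVideoMemoryConfiguration::GetAvailableSurfaceTypeByIndex in a loop, incrementing the dwTypeIndex variable from zero. Stop when the method returns the value DXVA2_SurfaceType_DecoderRenderTarget in the pdwType parameter. This step ensures that the video renderer supports hardware-accelerated decoding. It always succeeds for the EVR filter.

4) If the previous step succeeded, call IDirectXVideoMemoryConfiguration::SetSurfaceType with the value DXVA2_SurfaceType_DecoderRenderTarget. Calling SetSurfaceType with this value puts the video renderer into DXVA mode. When the video renderer is in this mode, the decoder must provide its own allocator.

The following code shows how to notify the video renderer: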

HRESULT CDecoder::SetEVRForDXVA2(IPin *pPin)
{
    HRESULT hr = S_OK;

    IMFGetService *pGetService = NULL;
    IDirectXVideoMemoryConfiguration *pVideoConfig = NULL;

    // Query the pin for IMFGetService.
    hr = pPin->QueryInterface(__uuidof(IMFGetService), (void**)&pGetService);

    // Get the IDirectXVideoMemoryConfiguration interface.
    if (SUCCEEDED(hr))
    {
        hr = pGetService->GetService(
            MR_VIDEO_ACCELERATION_SERVICE, IID_PPV_ARGS(&pVideoConfig));
    }

    // Notify the EVR.
    if (SUCCEEDED(hr))
    {
        DXVA2_SurfaceType surfaceType;

        for (DWORD iTypeIndex = 0; ; iTypeIndex++)
        {
            hr = pVideoConfig->GetAvailableSurfaceTypeByIndex(iTypeIndex, &surfaceType);

            if (FAILED(hr))
            {
                break;
            }

            if (surfaceType == DXVA2_SurfaceType_DecoderRenderTarget)
            {
                hr = pVideoConfig->SetSurfaceType(DXVA2_SurfaceType_DecoderRenderTarget);
                break;
            }
        }
    }

    SafeRelease(&pGetService);
    SafeRelease(&pVideoConfig);

    return hr;
}

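If the decoder finds a valid configuration and successfully notifies the video renderer, the decoder can use DXVA for decoding. The decoder must implement a custom allocator for its output pin, as described in the next section.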

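5. Allocating Uncompressed Buffers

In DXVA 2.0, the decoder is responsible for allocating the Direct3D surfaces that serve as uncompressed video buffers. Therefore, the decoder must implement a custom allocator that creates the surfaces. The media samples provided by this allocator hold pointers to the Direct3D surfaces. The EVR retrieves a pointer to the surface by calling IMFGetService::GetService on the media sample. The service identifier is MR_BUFFER_SERVICE.

To provide the custom allocator, perform the following steps:

1) Define a class for the media samples. This class can derive from the CMediaSample class. Inside this class, do the following:

a) Store a pointer to the Direct3D surface.

b) Implement the IMFGetService interface. In the GetService method, if the service GUID is MR_BUFFER_SERVICE, query the Direct3D surface for the requested interface. Otherwise, GetService can return MF_E_UNSUPPORTED_SERVICE.

c) Override the CMediaSample::GetPointer method to return E_NOTIMPL.

2) Define a class for the allocator. The allocator can derive from the CBaseAllocator class. Inside this class, do the following:

a) Override the CBaseAllocator::Alloc method. Inside this method, call IDirectXVideoAccelerationService::CreateSurface to create the surfaces. (The IDirectXVideoDecoderService interface inherits this method from IDirectXVideoAccelerationService.)

b) Override the CBaseAllocator::Free method to release the surfaces.

3) In your filter's output pin, override the CBaseOutputPin::InitAllocator method. Inside this method, create an instance of your custom allocator.

4) In your filter, implement the CTransformFilter::DecideBufferSize method. The pProperties parameter indicates the number of surfaces that the EVR requires. Add to this value the number of surfaces that your decoder requires, and call IMemAllocator::SetProperties on the allocator.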
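
The original topic gives no sample code for step 4. The sketch below shows only the arithmetic: it takes the buffer count requested by the renderer and adds the surfaces the decoder needs. AllocatorPropsStandIn mirrors the four members of ALLOCATOR_PROPERTIES so that the example builds without the DirectShow headers. DecoderSurfaceCount, AddDecoderSurfaces, and the rule "reference frames plus one" are assumptions made for illustration and are not part of the original sample.

```cpp
#include <cassert>

// Hypothetical stand-in that mirrors the members of ALLOCATOR_PROPERTIES.
struct AllocatorPropsStandIn
{
    long cBuffers;  // number of buffers (surfaces) requested
    long cbBuffer;  // size of each buffer, in bytes
    long cbAlign;   // buffer alignment
    long cbPrefix;  // prefix bytes before each buffer
};

// Assumed decoder requirement: one surface per reference frame, plus one
// for the frame that is currently being decoded.
long DecoderSurfaceCount(long referenceFrames)
{
    assert(referenceFrames >= 0);
    return referenceFrames + 1;
}

// Adds the decoder's surfaces to the count requested by the renderer.
// Only cBuffers changes; the other members pass through untouched.
AllocatorPropsStandIn AddDecoderSurfaces(AllocatorPropsStandIn requested,
                                         long referenceFrames)
{
    if (requested.cBuffers < 0)
    {
        requested.cBuffers = 0; // Treat an invalid request as zero.
    }
    requested.cBuffers += DecoderSurfaceCount(referenceFrames);
    return requested;
}
```

In a real DecideBufferSize the adjusted values would be passed to IMemAllocator::SetProperties, and the properties that the allocator actually reports back should be checked against the request.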

The following code shows how to implement the media sample class:

class CDecoderSample : public CMediaSample, public IMFGetService
{
    friend class CDecoderAllocator;

public:

    CDecoderSample(CDecoderAllocator *pAlloc, HRESULT *phr)
        : CMediaSample(NAME("DecoderSample"), (CBaseAllocator*)pAlloc, phr, NULL, 0),
          m_pSurface(NULL),
          m_dwSurfaceId(0)
    {
    }

    // Note: CMediaSample does not derive from CUnknown, so we cannot use the
    // DECLARE_IUNKNOWN macro that is used by most of the filter classes.

    STDMETHODIMP QueryInterface(REFIID riid, void **ppv)
    {
        CheckPointer(ppv, E_POINTER);

        if (riid == IID_IMFGetService)
        {
            *ppv = static_cast<IMFGetService*>(this);
            AddRef();
            return S_OK;
        }
        else
        {
            return CMediaSample::QueryInterface(riid, ppv);
        }
    }

    STDMETHODIMP_(ULONG) AddRef()
    {
        return CMediaSample::AddRef();
    }

    STDMETHODIMP_(ULONG) Release()
    {
        // Return a temporary variable for thread safety.
        ULONG cRef = CMediaSample::Release();
        return cRef;
    }

    // IMFGetService::GetService
    STDMETHODIMP GetService(REFGUID guidService, REFIID riid, LPVOID *ppv)
    {
        if (guidService != MR_BUFFER_SERVICE)
        {
            return MF_E_UNSUPPORTED_SERVICE;
        }
        else if (m_pSurface == NULL)
        {
            return E_NOINTERFACE;
        }
        else
        {
            return m_pSurface->QueryInterface(riid, ppv);
        }
    }

    // Override GetPointer because this class does not manage a system memory buffer.
    // The EVR uses the MR_BUFFER_SERVICE service to get the Direct3D surface.
    STDMETHODIMP GetPointer(BYTE ** ppBuffer)
    {
        return E_NOTIMPL;
    }

private:

    // Sets the pointer to the Direct3D surface.
    void SetSurface(DWORD surfaceId, IDirect3DSurface9 *pSurf)
    {
        SafeRelease(&m_pSurface);

        m_pSurface = pSurf;
        if (m_pSurface)
        {
            m_pSurface->AddRef();
        }

        m_dwSurfaceId = surfaceId;
    }

    IDirect3DSurface9 *m_pSurface;
    DWORD m_dwSurfaceId;
};

The following code shows how to implement the Alloc method on the allocator:

HRESULT CDecoderAllocator::Alloc()
{
    CAutoLock lock(this);

    HRESULT hr = S_OK;

    if (m_pDXVA2Service == NULL)
    {
        return E_UNEXPECTED;
    }

    hr = CBaseAllocator::Alloc();

    // If the requirements have not changed, do not reallocate.
    if (hr == S_FALSE)
    {
        return S_OK;
    }

    if (SUCCEEDED(hr))
    {
        // Free the old resources.
        Free();

        // Allocate a new array of pointers.
        m_ppRTSurfaceArray = new (std::nothrow) IDirect3DSurface9*[m_lCount];
        if (m_ppRTSurfaceArray == NULL)
        {
            hr = E_OUTOFMEMORY;
        }
        else
        {
            ZeroMemory(m_ppRTSurfaceArray, sizeof(IDirect3DSurface9*) * m_lCount);
        }
    }

    // Allocate the surfaces.
    if (SUCCEEDED(hr))
    {
        hr = m_pDXVA2Service->CreateSurface(
            m_dwWidth,
            m_dwHeight,
            m_lCount - 1,   // Back buffers; the method creates this many plus one.
            (D3DFORMAT)m_dwFormat,
            D3DPOOL_DEFAULT,
            0,              // Usage.
            DXVA2_VideoDecoderRenderTarget,
            m_ppRTSurfaceArray,
            NULL
            );
    }

    if (SUCCEEDED(hr))
    {
        for (m_lAllocated = 0; m_lAllocated < m_lCount; m_lAllocated++)
        {
            CDecoderSample *pSample = new (std::nothrow) CDecoderSample(this, &hr);

            if (pSample == NULL)
            {
                hr = E_OUTOFMEMORY;
                break;
            }
            if (FAILED(hr))
            {
                // The sample was created but failed to initialize.
                delete pSample;
                break;
            }
            // Assign the Direct3D surface pointer and the index.
            pSample->SetSurface(m_lAllocated, m_ppRTSurfaceArray[m_lAllocated]);

            // Add to the sample list.
            m_lFree.Add(pSample);
        }
    }

    if (SUCCEEDED(hr))
    {
        m_bChanged = FALSE;
    }
    return hr;
}

The following code shows the Free method:

void CDecoderAllocator::Free()
{
    CMediaSample *pSample = NULL;

    do
    {
        pSample = m_lFree.RemoveHead();
        if (pSample)
        {
            delete pSample;
        }
    } while (pSample);

    if (m_ppRTSurfaceArray)
    {
        for (long i = 0; i < m_lAllocated; i++)
        {
            SafeRelease(&m_ppRTSurfaceArray[i]);
        }

        delete [] m_ppRTSurfaceArray;
        // Clear the pointer so that a second call to Free is harmless.
        m_ppRTSurfaceArray = NULL;
    }
    m_lAllocated = 0;
}

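6. Decoding

To create the decoder device, call IDirectXVideoDecoderService::CreateVideoDecoder. The method returns a pointer to the IDirectXVideoDecoder interface of the decoder device.

On each frame, call IDirect3DDeviceManager9::TestDevice to test the device handle. If the device has changed, the method returns DXVA2_E_NEW_VIDEO_DEVICE. If this occurs, do the following:

1) Close the device handle by calling IDirect3DDeviceManager9::CloseDeviceHandle.

2) Release the IDirectXVideoDecoderService and IDirectXVideoDecoder pointers.

3) Open a new device handle.

4) Negotiate a new decoder configuration, as described in section 3.

5) Create a new decoder device.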
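
The recovery sequence above can be sketched as a small piece of control flow. This is only a model of the ordering, not code from the original topic: DeviceTest stands in for the HRESULT returned by TestDevice, and RecoverySteps with its five hooks is a hypothetical structure that a filter would fill with the real calls (CloseDeviceHandle, releasing the interfaces, OpenDeviceHandle, the configuration search from section 3, and CreateVideoDecoder).

```cpp
#include <cassert>
#include <functional>

// Stand-in for the outcome of IDirect3DDeviceManager9::TestDevice.
enum class DeviceTest { Ok, NewVideoDevice, Error };

// Hypothetical hooks, one per recovery step. Each returns true on success.
struct RecoverySteps
{
    std::function<bool()> closeDeviceHandle;     // step 1
    std::function<bool()> releaseDecoderObjects; // step 2
    std::function<bool()> openDeviceHandle;      // step 3
    std::function<bool()> findConfiguration;     // step 4
    std::function<bool()> createDecoder;         // step 5
};

// Returns true if decoding may proceed for the current frame.
bool PrepareDeviceForFrame(DeviceTest result, const RecoverySteps& steps)
{
    if (result == DeviceTest::Ok)
    {
        return true; // Handle is still valid; nothing to do.
    }
    if (result != DeviceTest::NewVideoDevice)
    {
        return false; // Some other failure; this sketch does not recover.
    }

    // All five hooks must be set before recovery starts.
    assert(steps.closeDeviceHandle && steps.releaseDecoderObjects &&
           steps.openDeviceHandle && steps.findConfiguration &&
           steps.createDecoder);

    // Short-circuit evaluation keeps the steps in order and stops at the
    // first one that fails.
    return steps.closeDeviceHandle()
        && steps.releaseDecoderObjects()
        && steps.openDeviceHandle()
        && steps.findConfiguration()
        && steps.createDecoder();
}
```

Stopping at the first failed step is a design choice of this sketch; a real filter might prefer to release everything it holds before reporting the error.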

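Assuming that the device handle is valid, the decoding process works as follows:

1) Call IDirectXVideoDecoder::BeginFrame.

2) Do the following one or more times:

a) Call IDirectXVideoDecoder::GetBuffer to get a DXVA decoder buffer.

b) Fill the buffer.

c) Call IDirectXVideoDecoder::ReleaseBuffer.

3) Call IDirectXVideoDecoder::Execute to perform the decoding operations on the frame.

DXVA 2.0 uses the same data structures as DXVA 1.0 for decoding operations.

Within each pair of BeginFrame/Execute calls, you may call GetBuffer multiple times, but only once for each type of DXVA buffer. If you call it twice with the same buffer type, you will overwrite the data.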
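
One way to guard against the double GetBuffer mistake is to track which buffer types have been requested in the current frame. The sketch below is a hypothetical bookkeeping helper, not part of DXVA or of the original sample: FrameGuard and BufferKind are invented names, and the real API identifies buffers with constants from dxva2api.h instead of this enum.

```cpp
#include <cassert>
#include <set>

// Stand-in buffer kinds, used only by this sketch.
enum class BufferKind { PictureParameters, QuantMatrix, SliceControl, Bitstream };

// Hypothetical guard that tracks one BeginFrame/Execute pair.
class FrameGuard
{
public:
    // Mirrors BeginFrame: starts a frame and forgets earlier buffers.
    void Begin()
    {
        m_inFrame = true;
        m_used.clear();
    }

    // Mirrors GetBuffer: returns false outside a frame, or when this kind
    // was already requested (a second request would overwrite the first).
    bool Acquire(BufferKind kind)
    {
        if (!m_inFrame)
        {
            return false;
        }
        return m_used.insert(kind).second;
    }

    // Mirrors Execute: allowed only inside a frame and only after at
    // least one buffer has been requested.
    bool Execute()
    {
        if (!m_inFrame || m_used.empty())
        {
            return false;
        }
        m_inFrame = false;
        return true;
    }

private:
    bool m_inFrame = false;
    std::set<BufferKind> m_used;
};
```

A decoder could call Acquire just before the real GetBuffer and treat a false result as a programming error.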

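After calling Execute, call IMemInputPin::Receive to deliver the frame to the video renderer, as with software decoding. The Receive method is asynchronous; after it returns, the decoder can continue decoding the next frame. The display driver prevents any decoding commands from overwriting the buffer while the buffer is in use. The decoder should not reuse a surface to decode another frame until the renderer has released the sample. When the renderer releases the sample, the allocator puts the sample back into its pool of available samples. To get the next available sample, call CBaseOutputPin::GetDeliveryBuffer, which in turn calls IMemAllocator::GetBuffer.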