[Repost] Marshaling a SAFEARRAY of Managed Structures by P/Invoke Part 2.

1. Introduction.

1.1 In part 1 of this series of articles, I explained how managed arrays may be transferred to unmanaged code as a SAFEARRAY.

1.2 In part 1, the SAFEARRAY was passed to unmanaged code as an “in” (read-only) parameter.

1.3 Here in part 2, I shall continue the discussion, this time with the aim of showing how to return a SAFEARRAY from unmanaged code to managed code as an “out” parameter.

1.4 As usual, throughout this article, we shall be working only with single-dimensional managed arrays and SAFEARRAYs.

2. TestStructure, CSConsoleApp.tlb Type Library and UnmanagedDll.DLL

2.1 We shall be using the same TestStructure struct that we have developed in part 1.

2.2 We shall also continue to use the CSConsoleApp.tlb type library that was produced from the CSConsoleApp console application solution that was presented in part 1.

2.3 We shall augment UnmanagedDll.dll with some helper functions as well as a new exported API to be called in the CSConsoleApp console application.

3. Unmanaged API that returns a SAFEARRAY of TestStructure.

3.1 The new exported function that we will expose to C# takes as parameter a double pointer to a SAFEARRAY of TestStructure structures.

3.2 This double pointer will be used to return a SAFEARRAY of TestStructure UDTs to the caller. In our case, the caller will be the interop marshaler which will transform the SAFEARRAY into a managed array.

3.3 The following is a full code listing of this function :

#import "CSConsoleApp.tlb" raw_interfaces_only no_implementation
using namespace CSConsoleApp;
#include <vector>
#include <algorithm>
#include <functional>

extern "C" __declspec(dllexport) void __stdcall GetArrayOfTestStructure
(
  /*[out]*/ SAFEARRAY** ppSafeArrayReceiver
)
{
	// Use an instance of the STL vector class to store
	// instances of our structure.
	std::vector<TestStructure> vecTestStructure;
	TestStructure test_structure;

	// Set values to "test_structure".
	test_structure.m_integer = 0;
	test_structure.m_double = 0;
	test_structure.m_string = ::SysAllocString(L"Hello World");
	// When "test_structure" is inserted into
	// the vector, a copy of "test_structure"
	// is created and is then pushed into the
	// vector.
	vecTestStructure.push_back(test_structure);

	test_structure.m_integer = 1;
	test_structure.m_double = 1.0;
	test_structure.m_string = ::SysAllocString(L"Hello World");
	vecTestStructure.push_back(test_structure);

	test_structure.m_integer = 2;
	test_structure.m_double = 2.0;
	test_structure.m_string = ::SysAllocString(L"Hello World");
	vecTestStructure.push_back(test_structure);

	test_structure.m_integer = 3;
	test_structure.m_double = 3.0;
	test_structure.m_string = ::SysAllocString(L"Hello World");
	vecTestStructure.push_back(test_structure);

	HRESULT hrRet;
	IRecordInfoPtr spIRecordInfoTestStructure = NULL;

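	hrRet = GetIRecordType
	(
		TEXT("CSConsoleApp.tlb"),
		__uuidof(TestStructure),
		&spIRecordInfoTestStructure
	);

	if (FAILED(hrRet) || spIRecordInfoTestStructure == NULL)
	{
		// Without an IRecordInfo we can neither create the SAFEARRAY
		// nor call RecordClear(), so free the BSTRs directly and
		// return an empty receiver.
		for (size_t i = 0; i < vecTestStructure.size(); i++)
		{
			::SysFreeString(vecTestStructure[i].m_string);
		}
		*ppSafeArrayReceiver = NULL;
		return;
	}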

	CreateSafeArrayEx<TestStructure, VT_RECORD>
	(
		(TestStructure*)&(vecTestStructure[0]),
		vecTestStructure.size(),
		(PVOID)spIRecordInfoTestStructure,
		*ppSafeArrayReceiver
	);

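	// Before the end of this function, each of the TestStructure structs
	// inside vecTestStructure must be cleared.
	// This is because each still owns the BSTR allocated for it by
	// SysAllocString(); the SAFEARRAY holds its own separate copies.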
	std::for_each
	(
		vecTestStructure.begin(),
		vecTestStructure.end(),
		clear_test_structure(spIRecordInfoTestStructure)
	);
}

The following is a synopsis of this function :

  • This function uses the STL vector to temporarily store TestStructure structs. Hence there is a need to #include <vector>.
  • It repeatedly uses one single instance of the TestStructure struct (i.e. “test_structure”) for value setting and inserting into the vector of TestStructure (i.e. “vecTestStructure”).
  • Eventually “vecTestStructure” will contain 4 TestStructure structs.
  • A helper function GetIRecordType() is used to obtain a reference to the IRecordInfo object associated with the TestStructure UDT (see section 4 for more details).
  • Another helper function CreateSafeArrayEx<>() is used to generate a SAFEARRAY of TestStructure from an input vector (see section 5 for more details).
  • Finally, before the function completes, we need to loop through the elements of the vecTestStructure vector and perform a clearing of each TestStructure struct contained in it. This will be explained in greater detail in section 8.

4. The GetIRecordType() Helper Function.

4.1 In this section, we shall explore the GetIRecordType() helper function. Full source is listed below :

HRESULT GetIRecordType
(
  LPCTSTR lpszTypeLibraryPath,
  REFGUID refguid,
  IRecordInfo** ppIRecordInfoReceiver
)
{
	_bstr_t	bstTypeLibraryPath = lpszTypeLibraryPath;
	ITypeLib* pTypeLib = NULL;
	ITypeInfo* pTypeInfo = NULL;
	HRESULT hrRet = S_OK;

	*ppIRecordInfoReceiver = NULL;  // Initialize receiver.
	hrRet = LoadTypeLib((const OLECHAR FAR*)bstTypeLibraryPath, &pTypeLib);

	if (SUCCEEDED(hrRet))
	{
		if (pTypeLib)
		{
			hrRet = pTypeLib -> GetTypeInfoOfGuid(refguid, &pTypeInfo);
			pTypeLib->Release();
			pTypeLib = NULL;
		}

		if (pTypeInfo)
		{
			hrRet = GetRecordInfoFromTypeInfo(pTypeInfo, ppIRecordInfoReceiver);
			pTypeInfo->Release();
			pTypeInfo = NULL;
		}
	}

	return hrRet;
}

The following is a general synopsis :

  • It uses the LoadTypeLib() API to load a required type library.
  • If the type library loading is successful, a pointer to the ITypeLib interface associated with the type library will be returned.
  • GetIRecordType() will then use the ITypeLib::GetTypeInfoOfGuid() method to obtain a reference to the ITypeInfo interface of an object, contained within the type library, which is associated with a GUID.
  • Then, assuming that the ITypeInfo object just acquired is that of a UDT, GetRecordInfoFromTypeInfo() is called to obtain a reference to an IRecordInfo interface which is associated with the UDT.

4.2 It is a pretty straightforward function which can be repeatedly used to obtain an IRecordInfo interface associated with any UDT from any type library.

4.3 Just ensure that the path to the type library is specified correctly.

5. The CreateSafeArrayEx<>() Helper Function.

5.1 In this section, we shall explore the CreateSafeArrayEx<>() templated helper function. Full source is listed below :

template <class T, VARTYPE v>
void CreateSafeArrayEx
(
  T* lpT,
  ULONG ulSize,
  PVOID pvExtraInfo,
  SAFEARRAY*& pSafeArrayReceiver
)
{
	HRESULT hrRetTemp = S_OK;
	SAFEARRAYBOUND rgsabound[1];
	ULONG ulIndex = 0;
	long lRet = 0;

	// Initialise receiver.
	pSafeArrayReceiver = NULL;

	if (lpT)
	{
		rgsabound[0].lLbound = 0;
		rgsabound[0].cElements = ulSize;

		pSafeArrayReceiver = (SAFEARRAY*)SafeArrayCreateEx
		(
			(VARTYPE)v,
			(unsigned int)1,
			(SAFEARRAYBOUND*)rgsabound,
			(PVOID)pvExtraInfo
		);
	}

	if (pSafeArrayReceiver == NULL)
	{
		// If not able to create SafeArray,
		// exit immediately.
		return;
	}

	for (ulIndex = 0; ulIndex < ulSize; ulIndex++)
	{
		long lIndexVector[1];

		lIndexVector[0] = ulIndex;

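		// Each element is copied into the SAFEARRAY. If a copy fails,
		// destroy the partially filled array so that the caller never
		// receives an incomplete result.
		hrRetTemp = SafeArrayPutElement
		(
			(SAFEARRAY*)pSafeArrayReceiver,
			(long*)lIndexVector,
			(void*)(&(lpT[ulIndex]))
		);

		if (FAILED(hrRetTemp))
		{
			SafeArrayDestroy(pSafeArrayReceiver);
			pSafeArrayReceiver = NULL;
			return;
		}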
	}

	return;
}

The following is a general synopsis of this function :

  • The purpose of this function is to create a SAFEARRAY of a specific Variant Type which is specifiable by template parameter “v”.
  • The function also takes as parameter a pointer to an array of type “T” which is a template parameter.
  • “T” must, of course, be compatible with “v”.
  • A general pointer to some object (“pvExtraInfo”) can be provided by the caller. This “pvExtraInfo” parameter will be passed to the SafeArrayCreateEx() API. Please refer to SafeArrayCreateEx() for more information on “pvExtraInfo”.
  • The CreateSafeArrayEx<>() function essentially uses the SafeArrayCreateEx() API to create the SAFEARRAY and then loops through the input array of type “T”.
  • Each element of the “T” array is inserted into the SAFEARRAY via SafeArrayPutElement().

5.2 The CreateSafeArrayEx<>() helper function can be used to great effect when combined with STL vectors. This is demonstrated in the GetArrayOfTestStructure() function of section 3.

6. Example C# Call to GetArrayOfTestStructure().

6.1 The following shows how the GetArrayOfTestStructure() API should be declared in a C# program :

[DllImport("UnmanagedDll.dll", CallingConvention = CallingConvention.StdCall)]
private static extern void GetArrayOfTestStructure
(
  [Out] [MarshalAs(UnmanagedType.SafeArray, SafeArraySubType = VarEnum.VT_RECORD)]
  out TestStructure[] SafeArrayReceiver
);

Now note the use of the various attributes :

  • The OutAttribute is used to indicate to the interop marshaler that the “SafeArrayReceiver” parameter is to be marshaled single-directionally “out of” the function.
  • This also indicates to the interop marshaler that whatever form the counterpart parameter (i.e. the parameter of the unmanaged function) takes when it is returned from the unmanaged function, it will be owned by the caller.
  • The way the MarshalAsAttribute is specified as well as the use of the OutAttribute indicate to the interop marshaler that the counterpart parameter will take the form of a double pointer to a SAFEARRAY.
  • Finally, the “SafeArraySubType” field for the MarshalAsAttribute, being equal to “VarEnum.VT_RECORD”, indicates to the interop marshaler that the SAFEARRAY will contain UDTs.
  • And since the “SafeArrayReceiver” parameter is typed as an array of TestStructure, the UDT must be the equivalent of the TestStructure.

6.2 The following is a sample C# function that makes a call to GetArrayOfTestStructure() :

static void DoTest_GetArrayOfTestStructure()
{
    TestStructure[] TestStructureArrayReceiver;

    GetArrayOfTestStructure(out TestStructureArrayReceiver);

    for (int i = 0; i < TestStructureArrayReceiver.Length; i++)
    {
        Console.WriteLine("TestStructureArrayReceiver[{0}].m_integer : [{1}]",
                           i, TestStructureArrayReceiver[i].m_integer);
        Console.WriteLine("TestStructureArrayReceiver[{0}].m_double : [{1}]",
                           i, TestStructureArrayReceiver[i].m_double);
        Console.WriteLine("TestStructureArrayReceiver[{0}].m_string : [{1:S}]",
                           i, TestStructureArrayReceiver[i].m_string);
    }
}

The following is a synopsis :

  • An array of TestStructure (i.e. “TestStructureArrayReceiver”) is defined but not instantiated.
  • This is fine because the array will be passed as an “out” parameter to GetArrayOfTestStructure() and so the understanding is that it will be instantiated when GetArrayOfTestStructure() returns.
  • After GetArrayOfTestStructure() is called and has returned, a loop displays the field values of each TestStructure element of the array.

6.3 The following is what will happen under the covers :

  • When GetArrayOfTestStructure() is called, the interop marshaler will internally prepare a pointer to a SAFEARRAY, “pSafeArrayReceiver”, say.
  • The interop marshaler will then make a call to GetArrayOfTestStructure() and pass the address of “pSafeArrayReceiver” as parameter.
  • The GetArrayOfTestStructure() function will instantiate an actual SAFEARRAY and make “pSafeArrayReceiver” point to it.
  • When GetArrayOfTestStructure() returns, the interop marshaler will use the SAFEARRAY pointed to by “pSafeArrayReceiver” to internally create a managed array.
  • Various SAFEARRAY APIs will be used to extract dimension and bounds information from the SAFEARRAY.
  • When the managed array of TestStructure is finally created, the SAFEARRAY pointed to by “pSafeArrayReceiver” will be destroyed. Each TestStructure contained inside the SAFEARRAY will be destroyed by calling on the RecordDestroy() method using the IRecordInfo pointer which is already contained within the SAFEARRAY.
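
The sequence above can be imitated in portable C++ to make the hand-over of ownership easier to see. This is only a sketch under stated assumptions: FakeSafeArray, get_array() and marshal_out_array() are hypothetical stand-ins invented for illustration, and nothing here reflects how the CLR actually implements its marshaler.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an unmanaged SAFEARRAY of records.
struct FakeSafeArray
{
    std::vector<std::string> elements;
};

// Plays the role of GetArrayOfTestStructure(): it allocates the array
// and hands ownership to the caller through the double pointer.
inline void get_array(FakeSafeArray** ppReceiver)
{
    assert(ppReceiver != nullptr);
    FakeSafeArray* pArray = new FakeSafeArray;
    pArray->elements.assign(4, "Hello World");
    *ppReceiver = pArray;
}

// Plays the role of the interop marshaler.
inline std::vector<std::string> marshal_out_array()
{
    // Prepare a pointer and pass its address to the function.
    FakeSafeArray* pSafeArrayReceiver = nullptr;
    get_array(&pSafeArrayReceiver);

    // Build the "managed" array from the returned elements.
    std::vector<std::string> managed;
    if (pSafeArrayReceiver != nullptr)
    {
        managed.assign(pSafeArrayReceiver->elements.begin(),
                       pSafeArrayReceiver->elements.end());

        // The caller owns the returned array, so the caller destroys it.
        delete pSafeArrayReceiver;
    }
    return managed;
}
```

Calling marshal_out_array() from a test program yields a vector of four strings, and the stand-in array no longer exists by the time the function returns.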

6.4 At runtime, the C# function DoTest_GetArrayOfTestStructure() will produce the following expected output :

TestStructureArrayReceiver[0].m_integer : [0]
TestStructureArrayReceiver[0].m_double : [0]
TestStructureArrayReceiver[0].m_string : [Hello World]
TestStructureArrayReceiver[1].m_integer : [1]
TestStructureArrayReceiver[1].m_double : [1]
TestStructureArrayReceiver[1].m_string : [Hello World]
TestStructureArrayReceiver[2].m_integer : [2]
TestStructureArrayReceiver[2].m_double : [2]
TestStructureArrayReceiver[2].m_string : [Hello World]
TestStructureArrayReceiver[3].m_integer : [3]
TestStructureArrayReceiver[3].m_double : [3]
TestStructureArrayReceiver[3].m_string : [Hello World]

7. Memory Management Issues in GetArrayOfTestStructure().

7.1 We will now begin to examine the various memory management issues that are significant in the exported GetArrayOfTestStructure() function.

7.2 Memory leakage problems exist but are not obvious. I shall go through them as thoroughly as possible in the sections that follow.

8. Storage of TestStructure Inside an STL vector.

8.1 The STL vector, just like the SAFEARRAY, employs “copy-semantics”. This means that when an item (most notably an instance of a class or a structure) is inserted into the vector, a complete copy of the item is created and then stored inside the vector.

8.2 For instances of C++ classes, this results in the invocation of the class’ copy-constructor to generate a copy. If no copy-constructor is declared, the compiler generates one that copies the object member by member.

8.3 This has important consequences for the TestStructure struct. Note that the most important field of TestStructure is the “m_string” field which is a BSTR. This field is instantiated using the SysAllocString() API.

8.4 When a TestStructure is inserted into the vecTestStructure vector, a complete copy of the structure is made. Since TestStructure has no user-defined copy constructor, a value-for-value clone of the structure is made.

8.5 What does this mean for the “m_string” BSTR member ? A BSTR is a pointer and so the pointer’s value (an address) gets copied into the structure which is inserted into the vector. Observe the following code :

// Set values to "test_structure".
test_structure.m_integer = 0;
test_structure.m_double = 0;
test_structure.m_string = ::SysAllocString(L"Hello World");
// When "test_structure" is inserted into
// the vector, a copy of "test_structure"
// is created and is then pushed into the
// vector.
vecTestStructure.push_back(test_structure);

First “test_structure.m_string” is assigned a BSTR with value “Hello World”. Then when “test_structure” is inserted into “vecTestStructure”, a value-for-value copy of “test_structure” is created. The “m_string” field of both the original “test_structure” and the copy will both point to the same BSTR.
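
The sharing described above can be reproduced without any COM types. The following is a minimal, portable sketch: FakeStructure, alloc_string() and vector_copy_shares_string() are hypothetical names that do not appear in the article's code, and a heap-allocated char* plays the role of the BSTR.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical stand-in for TestStructure: m_string is a raw owning
// pointer, playing the role of the BSTR.
struct FakeStructure
{
    int    m_integer;
    double m_double;
    char*  m_string;
};

// Stand-in for SysAllocString(): returns a heap copy of the text.
inline char* alloc_string(const char* text)
{
    assert(text != nullptr);
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

// Pushes one structure into a vector and reports whether the original
// and the stored copy share the same string buffer.
inline bool vector_copy_shares_string()
{
    std::vector<FakeStructure> vec;
    FakeStructure original = { 0, 0.0, alloc_string("Hello World") };

    // Member-wise copy: only the pointer value is duplicated.
    vec.push_back(original);

    const bool shared = (vec[0].m_string == original.m_string);

    // Free the buffer exactly once. Freeing through both copies
    // would be a double free.
    delete[] original.m_string;
    return shared;
}
```

vector_copy_shares_string() returns true, which is why the buffer must be released through only one of the two structures.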

8.6 Subsequent parts of the GetArrayOfTestStructure() code reuse “test_structure” and re-assign its “m_string” field. This is OK because the structure will again be inserted into the vector, so the pointer to the latest BSTR will be stored in the new copy, while the earlier BSTRs remain referenced by the copies already in the vector.

8.7 However when CreateSafeArrayEx<>() is called later to create a SAFEARRAY from the vecTestStructure vector it will be a different story as far as the “m_string” field is concerned. We will cover this issue in the next section.

8.8 Meantime, because all eventually allocated BSTRs will be stored inside “vecTestStructure”, before the end of the GetArrayOfTestStructure() function, these must be freed.

8.9 This is done by iterating through the elements of the “vecTestStructure” vector and calling IRecordInfo::RecordClear() on each TestStructure struct. Doing so will cause SysFreeString to be called on the “m_string” field.

8.10 In GetArrayOfTestStructure(), the iteration is done using the std::for_each() function and using the clear_test_structure unary functor :

std::for_each
(
	vecTestStructure.begin(),
	vecTestStructure.end(),
	clear_test_structure(spIRecordInfoTestStructure)
);

8.11 The clear_test_structure function object is listed below :

struct clear_test_structure : public std::unary_function<TestStructure, void>
{
  // Constructor.
  clear_test_structure(IRecordInfoPtr& refIRecordInfoPtr) :
    m_refIRecordInfoPtr(refIRecordInfoPtr)
  {
  }

  // Copy constructor.
  clear_test_structure(const clear_test_structure& rhs) :
    m_refIRecordInfoPtr(rhs.m_refIRecordInfoPtr)
  {
  }

  void operator () (TestStructure& test_structure)
  {
	m_refIRecordInfoPtr -> RecordClear((PVOID)&test_structure);
  }

  IRecordInfoPtr& m_refIRecordInfoPtr;
};
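
Note that std::unary_function was deprecated in C++11 and removed in C++17, and std::for_each() does not require it. On a newer compiler, a lambda can take the place of the functor. The sketch below shows the shape of that alternative using hypothetical stand-ins (FakeRecord, FakeRecordInfo, clear_all() and demo_clear_all() are invented for illustration); with the real interface, the lambda body would call RecordClear() through spIRecordInfoTestStructure instead.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for TestStructure with one owned resource.
struct FakeRecord
{
    int  m_integer;
    int* m_payload;
};

// Hypothetical stand-in for IRecordInfo: RecordClear() releases the
// resource held by a record and counts how many records it cleared.
struct FakeRecordInfo
{
    int cleared = 0;

    void RecordClear(FakeRecord* pRecord)
    {
        assert(pRecord != nullptr);
        delete pRecord->m_payload;
        pRecord->m_payload = nullptr;
        ++cleared;
    }
};

// Clears every record in the vector using a lambda instead of a functor.
inline int clear_all(std::vector<FakeRecord>& records, FakeRecordInfo& info)
{
    std::for_each(records.begin(), records.end(),
                  [&info](FakeRecord& record) { info.RecordClear(&record); });
    return info.cleared;
}

// Builds four records, clears them all and returns the number cleared.
inline int demo_clear_all()
{
    std::vector<FakeRecord> records;
    for (int i = 0; i < 4; ++i)
    {
        FakeRecord record = { i, new int(i) };
        records.push_back(record);
    }

    FakeRecordInfo info;
    return clear_all(records, info);
}
```

The lambda captures the record info object by reference, just as clear_test_structure holds a reference to the IRecordInfoPtr.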

9. Copy-Semantics of a SAFEARRAY.

9.1 The SAFEARRAY also adopts copy-semantics. However, it employs a more intelligent form of copy-semantics.

9.2 When used to contain UDTs, the SAFEARRAY will make a complete copy of any BSTRs or VARIANTs contained within the UDT. If the UDT contains any pointers to interfaces, AddRef() will be called on these pointers.

9.3 This being the case and focusing on the TestStructure struct, the original struct that gets inserted into a SAFEARRAY and the copy within the SAFEARRAY will not share the same “m_string” field. Each will contain its own copy of “m_string”.

9.4 Then, when the SAFEARRAY is destroyed, all TestStructure structs contained within it will be destroyed. This includes the “m_string” fields.

9.5 Hence, with reference to section 8, when we iterate through the TestStructure structs of “vecTestStructure” and call IRecordInfo::RecordClear() on each, the corresponding counterpart TestStructure structs in the SAFEARRAY will not be affected.
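
The difference between the two kinds of copy-semantics can be shown with a small portable model. This is a sketch under stated assumptions, not the SAFEARRAY implementation: FakeUdt, dup_string(), record_copy(), record_clear() and demo_deep_copy_survives_clear() are hypothetical names, and record_copy() merely imitates the deep copy that the article attributes to the SAFEARRAY.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for TestStructure; char* plays the BSTR.
struct FakeUdt
{
    int    m_integer;
    double m_double;
    char*  m_string;
};

// Stand-in for SysAllocString().
inline char* dup_string(const char* text)
{
    assert(text != nullptr);
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

// Stand-in for IRecordInfo::RecordClear(): frees the string field.
inline void record_clear(FakeUdt& udt)
{
    delete[] udt.m_string;
    udt.m_string = nullptr;
}

// Imitates the deep copy made on insertion: the stored element
// receives its own copy of the string.
inline FakeUdt record_copy(const FakeUdt& src)
{
    FakeUdt dst = src;
    dst.m_string = src.m_string ? dup_string(src.m_string) : nullptr;
    return dst;
}

inline std::string demo_deep_copy_survives_clear()
{
    // Plays the role of vecTestStructure.
    std::vector<FakeUdt> source;
    FakeUdt udt = { 1, 1.0, dup_string("Hello World") };
    source.push_back(udt);

    // Plays the role of the SAFEARRAY.
    std::vector<FakeUdt> fake_safearray;
    fake_safearray.push_back(record_copy(source[0]));

    // Clearing the original leaves the stored copy untouched.
    record_clear(source[0]);
    std::string survived = fake_safearray[0].m_string;

    // Destroying the "SAFEARRAY" releases its own copy.
    record_clear(fake_safearray[0]);
    return survived;
}
```

demo_deep_copy_survives_clear() returns the string held by the second container after the source element has been cleared, which mirrors why clearing “vecTestStructure” is safe.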

10. Ownership of the Returned SAFEARRAY.

10.1 Now the GetArrayOfTestStructure() function returns the SAFEARRAY to its caller via an “out” parameter.

10.2 This means that it is the caller which will own the SAFEARRAY. Hence the caller is responsible for its destruction after it has finished using it.

10.3 In the case of our example, the owner of the returned SAFEARRAY will be the interop marshaler of the CLR.

11. In Conclusion.

11.1 In part 2, I have demonstrated returning a SAFEARRAY from an unmanaged function to managed code.

11.2 I have explained how the SAFEARRAY is used by the interop marshaler to create an equivalent managed array. With memory ownership, the interop marshaler is able to destroy the returned SAFEARRAY thus ensuring no memory leakage.

11.3 Various memory management issues were studied (sections 7 and 8) and the storage mechanism of SAFEARRAYs was explained (section 9).

11.4 Finally, we learnt how, with memory ownership, the interop marshaler is at liberty to destroy returned SAFEARRAYs.

11.5 In the next installment of this series of articles, I shall demonstrate how to pass a SAFEARRAY of UDTs to and from an unmanaged function two-ways (i.e. as both an “in” and “out” parameter).