[Dr. Chris Oakley's home page]
OLE automation, being the object-oriented component of Visual Basic, Microsoft’s script language of choice, is something that programmers in MS-dominated environments will find increasingly hard to ignore. Many may well have been put off by its complexity (creating the objects in C++, at least) and the lack of any really clear documentation. My aim here is to try and redress this balance by providing a short but hopefully informative introduction.
From the user’s point of view, OLE automation is simplicity itself. It is the way that non-programmers can do object-oriented programming. VBA scripts for Excel, for example, are much easier to read and program than the previous macro language, but pack in more functionality. Apart from NeXT Step, stand-alone Visual Basic was for many years the only really fast, effective, bullet-proof screen builder on any platform and this uses OLE automation throughout. It is still hard to beat.
From the programmer’s point of view, though, the picture is not so rosy. Microsoft claim that programming OLE is now "easy" because of the MFC classes they have created to handle it. However if, as I do, you see Microsoft Frustration Classes as the visible manifestation of the fact that MS have failed to grasp that object-oriented programming should be for the service of the people, and not the other way round, then you may want to look one level beneath, which although more fiddly, allows greater control and insight. The proof of the pudding, of course, is that Microsoft, extolling the virtues of MFC to the highest degree, hardly use them at all themselves.
Here are links to the code samples: OleTest.odl; OleAut.h; OleAut.cpp; OleTest.cpp.
Visual Basic/OLE automation is a lot like a cut-down version of C++, but with a layer which includes automatic bounds checking and reference counting to prevent VB scripts from crashing the program. It has the additional facility of being able to execute methods as remote procedure calls. Unlike C++, this is built-in and requires no extra machinery.
The C++ declaration
ObjectType *T, *TArray[6];
corresponds to
Dim T As ObjectType, TArray(0 To 5) As ObjectType
in VB. "ObjectType" is a user-defined class, which in the VB case is an OLE object. Internally, each object is a pointer to a class derived from IDispatch, an abstract class that contains two principal services: (i) reference counting and (ii) dynamic binding. Let us examine these in turn:
(i) Reference counting
Consider the following piece of C++:
ObjectType *T = new ObjectType, *U;
U = T;
T->DoSomething();
delete T;
U->DoSomething();
Of course this will probably crash the program, and of course you have never done this or the equivalent because you realise the danger. Good. Neither have I. Well, almost never... VB has much less trust in you. VB’s attitude is that you must not crash the program under any circumstances, so it forces every object to carry with it a reference count. No command directly deletes an object, but when a reference is removed, either by the object variable going out of scope or it being assigned to something else, the reference count is decremented and only when it reaches zero is the object deleted. This makes it impossible to access a deleted or invalid object. The equivalent VB
Dim T As New ObjectType, U As ObjectType
Set U = T
T.DoSomething
Set T = Nothing
U.DoSomething
to the above will not give an error because the attempt to delete the object by Set T = Nothing reduces the reference count to one, but does not actually delete it. The object will not be deleted until the variable U goes out of scope. Reference counting is a feature of all OLE objects (OLE automation is merely one strand of the huge and complex web of OLE) and is implemented by deriving all OLE classes from an abstract class called IUnknown:
struct IUnknown
{
virtual HRESULT QueryInterface(const GUID * const riid, void **ppvObject) = 0;
virtual unsigned long AddRef(void) = 0;
virtual unsigned long Release(void) = 0;
};
The last two are the reference-count methods, and can be implemented as follows:
unsigned long DerivedFromIUnknown::AddRef(void) {return ++m_refs;} // return the new count, as Release does
unsigned long DerivedFromIUnknown::Release(void)
{
if (--m_refs > 0) return m_refs;
delete this; // destructor must be declared as virtual for this to work
return 0; // Note that we do not reference m_refs after the delete this
}
The variable m_refs is an unsigned long data member of the derived class, and is set to one when the object is constructed. Although there may appear to be little point in making these methods virtual as they will always be implemented in the way shown here, there are in fact circumstances in which it is necessary to override the AddRef and Release methods. This is when there is a two-way dependency, i.e. object A can only exist while object B is alive, and vice versa. This means that all references to A in the script may have gone, but the object remains alive because there are still references to B, and B depends on A. To ensure this, an internal call of A::AddRef from B is made. Since the same applies to B with regard to A, we end up with a situation where all the external references to either object have been removed, but the objects are not deleted because there remains one reference on A by B and one reference on B by A. This is known as a reference-counting loop, and is resolved by supplying methods B::AddRef and B::Release that act on the m_refs of the attached A object instead of on itself. The destructor of A then must destroy B.
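The loop-breaking trick can be sketched in portable C++ without any Windows headers. What follows is a minimal illustration under stated assumptions, not production OLE code: the names RefCounted, Owner and Part are made up for the example (Owner plays the role of A, Part that of B), and the m_destroyed flag exists only so that the moment of destruction can be observed.

```cpp
#include <cassert>

// Portable stand-in for the two reference-count methods of IUnknown.
// All names in this sketch are hypothetical.
struct RefCounted
{
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~RefCounted() {}
};

struct Owner;

// B: lives inside A and keeps no reference count of its own.
struct Part : RefCounted
{
    Owner *m_owner;
    explicit Part(Owner *owner) : m_owner(owner) {}
    unsigned long AddRef() override;   // forwards to the owner
    unsigned long Release() override;  // forwards to the owner
};

// A: holds the single reference count shared by the pair.
struct Owner : RefCounted
{
    unsigned long m_refs;
    Part *m_part;
    bool *m_destroyed;   // observation hook, set to true by the destructor

    explicit Owner(bool *destroyed)
        : m_refs(1), m_part(new Part(this)), m_destroyed(destroyed) {}

    unsigned long AddRef() override { return ++m_refs; }
    unsigned long Release() override
    {
        assert(m_refs > 0);
        if (--m_refs > 0) return m_refs;
        delete this;
        return 0;
    }

    // Hands out the part; the new reference lands on the shared counter.
    Part *GetPart() { m_part->AddRef(); return m_part; }

protected:
    ~Owner() override { delete m_part; *m_destroyed = true; }  // A destroys B
};

unsigned long Part::AddRef()  { return m_owner->AddRef(); }
// No member of Part is touched after the call, since it may delete both objects.
unsigned long Part::Release() { return m_owner->Release(); }
```

With this arrangement a reference held on either object keeps both alive, and releasing the last one, whichever object it is held through, destroys the pair. Making the destructors inaccessible from outside means that Release is the only way to dispose of an Owner.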
The first method in IUnknown is for the benefit of OLE itself. The globally-unique-ID (GUID) is a sixteen-byte identifier that identifies the object to OLE. For OLE automation this GUID is mostly an annoying irrelevancy, as the object name and the name of the program or library that services it are quite enough to identify it. It comes into its own, however, for structured storage. Here a file, known as a container, contains an OLE object which has embedded within it further OLE objects, a nesting which may continue to arbitrary depth. The GUID at the head of each of these streams identifies the bytes that follow. The matching of GUIDs to the programs or libraries that service them is a function carried out by the System Registry, and enables an application to open and view OLE documents without needing to know what they are. This is the mechanism by which, for example, you may embed a portion of an Excel spreadsheet in a Word document. The danger of using names and not GUIDs is that a new version of the program might name the streams the same but use a different file format (although, having said that, they could just as easily forget to provide a new GUID, but let’s not dwell on that). QueryInterface only seems to be called in VB when it needs to verify that the object is indeed an OLE automation object (i.e. of type IDispatch), which is when the object is created through VB CreateObject or its equivalent. A typical implementation might be
const GUID IID_DerivedFromIUnknown = {0x00000000, 0x0000, 0x0000, 0x00, 0x00,
0x0, 0x0, 0x00, 0x00, 0x00, 0x01};
// replace these numbers with a unique ID obtained from GUIDGEN.exe
HRESULT DerivedFromIUnknown::QueryInterface(const GUID * const riid, void **ppv)
{
if (*riid == IID_IUnknown || *riid == IID_IDispatch ||
*riid == IID_DerivedFromIUnknown) // riid is a pointer, so compare the GUID it points to
{ *ppv = this; AddRef(); return NOERROR; }
*ppv = NULL;
return ResultFromScode(E_NOINTERFACE);
}
To verify that an IUnknown is an IDispatch, OLE passes in the GUID for IDispatch to this method after the object is created, giving "Run-time error ‘430’: Class doesn’t support OLE Automation" if a negative response is given (clearly, it won’t here). Although largely redundant in this situation, QueryInterface is useful when IDispatch is only one of many interfaces that the object supplies. Pointers to the other interfaces are obtained by invoking this method with the relevant GUIDs.
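To make the multiple-interface case concrete, here is a portable sketch of an object that hands out two different interfaces through QueryInterface. It uses simplified stand-ins rather than the Windows definitions: Guid, Result, Unknown, Named, Valued, Pair and the three interface IDs are all invented for the illustration. The point it tries to show is that with multiple inheritance the pointer returned must be cast to the interface actually requested, so the plain *ppv = this of the single-interface case is no longer enough.

```cpp
#include <cassert>
#include <cstring>

// Simplified stand-ins for GUID and HRESULT (assumptions, not the real types).
struct Guid { unsigned long d1; unsigned short d2, d3; unsigned char d4[8]; };

inline bool operator==(const Guid &a, const Guid &b)
{
    return a.d1 == b.d1 && a.d2 == b.d2 && a.d3 == b.d3 &&
           std::memcmp(a.d4, b.d4, sizeof a.d4) == 0;
}

enum Result { kOk, kNoInterface };

// Made-up interface IDs; real ones would come from GUIDGEN.exe.
const Guid kIID_Unknown = {0, 0, 0, {0, 0, 0, 0, 0, 0, 0, 1}};
const Guid kIID_Named   = {0, 0, 0, {0, 0, 0, 0, 0, 0, 0, 2}};
const Guid kIID_Valued  = {0, 0, 0, {0, 0, 0, 0, 0, 0, 0, 3}};

struct Unknown
{
    virtual Result QueryInterface(const Guid &riid, void **ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct Named  : Unknown { virtual const char *Name() = 0; };
struct Valued : Unknown { virtual double Value() = 0; };

struct Pair : Named, Valued
{
    unsigned long m_refs;
    Pair() : m_refs(1) {}
    virtual ~Pair() {}

    Result QueryInterface(const Guid &riid, void **ppv) override
    {
        // Each interface has its own v-table pointer inside the object, so
        // the cast must name the interface being handed out.
        if (riid == kIID_Unknown || riid == kIID_Named)
            *ppv = static_cast<Named *>(this);
        else if (riid == kIID_Valued)
            *ppv = static_cast<Valued *>(this);
        else
        {
            *ppv = nullptr;
            return kNoInterface;
        }
        AddRef();   // the caller now owns one more reference
        return kOk;
    }
    unsigned long AddRef() override { return ++m_refs; }
    unsigned long Release() override
    {
        assert(m_refs > 0);
        if (--m_refs > 0) return m_refs;
        delete this;
        return 0;
    }
    const char *Name() override { return "pi"; }
    double Value() override { return 3.14; }
};
```

A client holding only a Named pointer can thus reach Valued by asking for kIID_Valued, and must call Release on whatever it receives.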
(ii) Dynamic Binding. In OLE automation the type of an object need not be known until run time. The declaration
Dim T As Object
identifies T as an object of unknown type (VB requires this only to support IUnknown, so it need not even be of type IDispatch: however if any processing is done, QueryInterface will be called on it to get an IDispatch interface). Before VB version 4.0, this was the only kind of user-defined object that was possible (although Excel 5.0 came out later with a VBA that did allow typed user-defined objects). This meant that a statement like
x = T.Method(param1, param2)
had to be handled in a way that assumed no prior knowledge of the object T. The string "Method" would be passed to the object, with the question, "do you support this?", using the IDispatch method GetIDsOfNames. If the answer was "yes" an ID would be passed back, which could then be fed into method Invoke with the parameters, if any, to execute the required action. For maximum generality, the parameters would be passed as a variable length array of VARIANT types, and the Invoke method would return a VARIANT. If the method returned a value, then it was up to VB to coerce it into the type of the variable being assigned to. The assignment and querying of "data" members are special cases of this. The assignments
T.Prop = value
value = T.Prop
which set or get the property Prop of the object merely trigger property put and property get methods. Since access to the internal data is only through these methods, data hiding is built in, a necessity in any case as the data may reside in a different memory space. This makes statements like
List1.Left = 100
possible, which not only sets the left-hand edge property of the object List1 to 100, but moves the control as well.
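The two-step look-up can be imitated in portable C++. The sketch below is only an analogy to the real interface: Variant is a toy stand-in for VARIANT, and Dispatch, NameValue, LateGet, LatePut and the kind flags are hypothetical names chosen for the example. It shows a caller that knows nothing about the object at compile time asking for an ID by name and then invoking through that ID.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for VARIANT: a tagged value that is empty, a number or text.
struct Variant
{
    enum Type { Empty, Number, Text } type;
    double num;
    std::string text;
    Variant() : type(Empty), num(0) {}
    Variant(double d) : type(Number), num(d) {}
    Variant(const char *s) : type(Text), num(0), text(s) {}
};

enum InvokeKind { kMethod = 1, kPropertyGet = 2, kPropertyPut = 4 };

struct Dispatch
{
    virtual ~Dispatch() {}
    // "Do you support this name?" Fills in *id and returns true if so.
    virtual bool GetIdOfName(const std::string &name, long *id) = 0;
    virtual bool Invoke(long id, int kind, const std::vector<Variant> &args,
                        Variant *result) = 0;
};

struct NameValue : Dispatch
{
    enum { ID_VALUE = 0, ID_NAME = 1, ID_SQUARE = 2 };
    std::string m_name;
    double m_value;
    NameValue() : m_value(0) {}

    bool GetIdOfName(const std::string &name, long *id) override
    {
        if (name == "value") *id = ID_VALUE;
        else if (name == "name") *id = ID_NAME;
        else if (name == "square") *id = ID_SQUARE;
        else return false;
        return true;
    }

    bool Invoke(long id, int kind, const std::vector<Variant> &args,
                Variant *result) override
    {
        if (kind & kPropertyPut)
        {
            if (args.size() != 1) return false;
            if (id == ID_VALUE && args[0].type == Variant::Number)
                { m_value = args[0].num; return true; }
            if (id == ID_NAME && args[0].type == Variant::Text)
                { m_name = args[0].text; return true; }
            return false;   // unknown member or wrong argument type
        }
        if (!args.empty() || result == nullptr) return false;
        switch (id)
        {
        case ID_VALUE:  *result = Variant(m_value); return true;
        case ID_NAME:   *result = Variant(m_name.c_str()); return true;
        case ID_SQUARE: *result = Variant(m_value * m_value); return true;
        }
        return false;
    }
};

// What a script host might do for "T.member = value".
bool LatePut(Dispatch &obj, const std::string &member, const Variant &value)
{
    long id = 0;
    if (!obj.GetIdOfName(member, &id)) return false;
    return obj.Invoke(id, kPropertyPut, std::vector<Variant>(1, value), nullptr);
}

// What a script host might do for "x = T.member".
bool LateGet(Dispatch &obj, const std::string &member, Variant *out)
{
    assert(out != nullptr);
    long id = 0;
    if (!obj.GetIdOfName(member, &id)) return false;
    return obj.Invoke(id, kMethod | kPropertyGet, std::vector<Variant>(), out);
}
```

Everything the caller knows about the object arrives as strings and variants at run time, which is why this path is slower than a direct virtual call.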
Here is the definition of IDispatch:
struct IDispatch : IUnknown
{
virtual HRESULT GetTypeInfoCount(unsigned int *pctinfo) = 0;
virtual HRESULT GetTypeInfo(unsigned int itinfo, unsigned long lcid, ITypeInfo **pptinfo) = 0;
virtual HRESULT GetIDsOfNames(const GUID * const riid, OLECHAR **rgszNames, unsigned int cNames, unsigned long lcid, long *rgdispid) = 0;
virtual HRESULT Invoke(long dispidMember, const GUID * const riid, unsigned long
lcid, unsigned short wFlags, DISPPARAMS *pdispparams, VARIANT *pvarResult,
EXCEPINFO *pexcepinfo, unsigned int *puArgErr) = 0;
};
The first two methods provide the link to a type library, the equivalent of a header file, to be explored next, which contains information about the object (NB: these methods never seem to be called from VB, which prefers to use type libraries to find out about objects rather than vice versa). The other two provide the dynamic method invocation functionality outlined before. These are most easily implemented using OLE functions DispGetIDsOfNames and DispInvoke which use the definition of the object in the type library to give the appropriate responses.
The type library is the mechanism by which VB gets to know about user-defined objects. It contains the information here that makes possible declarations of the form
Dim T As ObjectType
where ObjectType is a class defined in the type library.
The source file, with extension ODL (Object Description Language) is an extended version of a C++ header file. This is compiled into a binary of type TLB (Type LiBrary) by a program called MkTypLib. From VC++ 4.0 onwards, an ODL file can be included into the project, VC++ 4.0 being aware of how it should be built. Making VB or VBA aware of it is then a simple matter of locating and including the TLB from the Tools/References dialog. If this is successful, the objects will be visible in the Object Browser.
To understand the syntax of an ODL file it is necessary to realise that remote procedure calls are an essential part of OLE automation. The object may reside in a different memory space, or even a different machine. Pointers cannot therefore be sent. Pointers only appear as parameters in order to mark out chunks of memory which are to be sent across the process boundary, or to mark out memory for receiving the reply. Concessions are not made for OLE servers (in-process or DLL servers) that reside in the same memory space. The keywords in and out are applied to each parameter in each method, to indicate whether it is to be sent, received, or both. There are restrictions on types. User-defined structures are not permitted (although this is probably more to do with the difficulty of describing them in a VARIANT structure than RPC issues). Arrays must be of type SAFEARRAY, which contains bound information. Strings are of type BSTR, which contains a character count. The memory management of both these types must be done by OLE, which provides APIs for the purpose. A pointer to another OLE object, however, may be passed, in which case, if the object is in a separate process, OLE sets up the machinery (a proxy for the object at the client end, and a stub at the server) to enable remote manipulation. More details are given in Appendix A. A type library may also include help strings and context-sensitive references to a help file. It also provides an alternative to BAS files for flat function declarations in DLLs through the module declaration. The parameters of functions in a module section are however subject to the same stringencies as object methods (albeit unnecessarily, as the DLL resides in the same memory space. Actually MkTypLib allows module function declarations to include user-defined types, but VB refuses to recognise them).
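Why an array must carry its own bounds becomes clearer if one imagines doing the packaging by hand. The following portable sketch is an illustration only: BoundedArray is a made-up, much reduced analogue of a one-dimensional SAFEARRAY of doubles, and Pack and Unpack are hypothetical helpers standing in for what a proxy and stub would do. Because the lower bound and element count travel with the data, the receiving side can rebuild the array and check every index without any pointer from the sender's address space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical analogue of a one-dimensional SAFEARRAY of doubles.
struct BoundedArray
{
    long lowerBound;             // e.g. 0 for VB's "0 To 5"
    std::vector<double> items;   // the element count is items.size()

    BoundedArray(long lower, std::size_t count)
        : lowerBound(lower), items(count) {}

    bool InRange(long index) const
    {
        return index >= lowerBound &&
               static_cast<std::size_t>(index - lowerBound) < items.size();
    }
    bool Put(long index, double v)
    {
        if (!InRange(index)) return false;
        items[static_cast<std::size_t>(index - lowerBound)] = v;
        return true;
    }
    bool Get(long index, double *out) const
    {
        if (!InRange(index) || out == nullptr) return false;
        *out = items[static_cast<std::size_t>(index - lowerBound)];
        return true;
    }
};

// Flattens the array into bytes: lower bound, count, then the elements.
std::vector<unsigned char> Pack(const BoundedArray &a)
{
    const std::size_t count = a.items.size();
    const std::size_t header = sizeof(long) + sizeof(std::size_t);
    std::vector<unsigned char> buf(header + count * sizeof(double));
    std::memcpy(buf.data(), &a.lowerBound, sizeof(long));
    std::memcpy(buf.data() + sizeof(long), &count, sizeof(std::size_t));
    if (count > 0)
        std::memcpy(buf.data() + header, a.items.data(), count * sizeof(double));
    return buf;
}

// Rebuilds an independent copy from the bytes produced by Pack.
BoundedArray Unpack(const std::vector<unsigned char> &buf)
{
    const std::size_t header = sizeof(long) + sizeof(std::size_t);
    assert(buf.size() >= header);
    long lower = 0;
    std::size_t count = 0;
    std::memcpy(&lower, buf.data(), sizeof(long));
    std::memcpy(&count, buf.data() + sizeof(long), sizeof(std::size_t));
    assert(buf.size() == header + count * sizeof(double));
    BoundedArray a(lower, count);
    if (count > 0)
        std::memcpy(a.items.data(), buf.data() + header, count * sizeof(double));
    return a;
}
```

A bare double * would give Pack no way of knowing how many bytes to copy, which is the difficulty that SAFEARRAY and the counted BSTR are there to remove.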
Here is a sample ODL file. This link shows it unmolested by interspersed comments:
[uuid(01234567-89ab-cdef-0123-0123456789ab), version(1.0), helpstring ("Test OLE automation")]
library OleTest
{
A uuid is needed for the library as a whole to enable OLE to keep a reference to the interface in the System Registry. The program GUIDGEN.exe generates these (supposedly) globally unique IDs. When the output TLB is referenced in VB/VBA, the Object Browser will show the library name and help string given here to enable you to identify it. We use this uuid to get a pointer to the type library via LoadRegTypeLib. Two keywords define classes: coclass and interface. (NB there is also dispinterface, but this has been made obsolete by the introduction of the oleautomation attribute on interface.) The coclass section is required if the object supports multiple OLE interfaces, or is creatable (more of this later). The interface section, though, is where most of the work is done. It defines the object both to VB and to the C++ code that you are going to write to handle it. MkTypLib can be made to output a C++ header file as well as a TLB, which may be included in the C++ project, and if you look at this you can see how it is interpreting the interface declaration. The first (but not most obvious, thanks to all MS’s macro definitions) thing is that all the methods are declared as virtual. OLE itself overrides them when the object is being manipulated remotely. They are replaced with proxy methods whose purpose is to package or unpackage the arguments for transmission across the process boundary. OLE code known as a stub, receiving the request, executes the method in the remote process.
importlib("STDOLE.TLB"); // For defn of IDispatch, etc.
[uuid(d0bed0be-d000-beee-d000-d0bed0bed0be), odl, oleautomation, dual,
helpstring("Name/Value pair")]
interface TestObj : IDispatch
{
The attribute oleautomation combined with derivation from IDispatch makes the interface visible to VB. The attribute dual is your promise to VB that it may call your methods directly, and need not use the indirect path involving IDispatch::Invoke. This (static binding) significantly improves efficiency when the VB/VBA implementation chooses to make use of it.
[propget, helpstring("Name of quantity")] HRESULT name([out,retval]BSTR *name);
[propput] HRESULT name([in]BSTR name);
[propget, id(0), helpstring("Value (default property)")] HRESULT
value([out,retval]double *value);
[propput, id(0)] HRESULT value([in]double value);
[helpstring("square of value")] HRESULT square([out,retval]double *square);
};
This declares methods for getting and setting the properties name and value, and a method for getting the square of value. The Object Browser will show two properties and one method (you can make properties read-only by omitting the propput method). The type HRESULT is a standardised OLE error code. You can raise VB errors by returning something other than NOERROR. Unless "On Error" is set, this will interrupt the program execution and pop up a message box. If the final parameter has attribute retval, then it is the return value of the method, or property get, and will not appear in the VB argument list. The attribute id(0) makes the value property the "default" property of the object which (apart from in a Set statement) is used when the object name is used without qualification.
[dllname("OleTest.dll")] module utilities
{
[entry("?NewTestObj@@YGPAUTestObj@@PAGN@Z"), helpstring("Create & initialise TestObj")]
TestObj * pascal NewTestObj([in]BSTR name, [in]double value);
};
};
The module section here defines a single function that returns a pointer to a TestObj. This creates a new instance of the object which can then be assigned to an ObjectType object variable by a Set command. The "mangled" function name used in the entry attribute was obtained from the MAP file output by the linker. The other ways of creating OLE automation objects are CreateObject and GetObject (which require the type to be the general Object rather than the specific ObjectType) and the New keyword. These trigger a complicated sequence of events involving the System Registry, which is explained in Appendix B. Creating objects with a function call is much simpler, and more flexible as, unlike the other methods, it enables you to pass initialisation data.
The following C++ code can be used to implement this in a DLL.
The class SimpleDispatch implements the methods of the IUnknown and IDispatch interfaces by referring the caller to the relevant portion of the type library. All that a derived class has to do is to implement the method IID_This(), which identifies it by its GUID.
#include <windows.h>
extern GUID g_IID_Library;
extern unsigned short g_MajVerNo, g_MinVerNo;
extern LCID g_LCID;
struct VirtualDestructor
{
virtual ~VirtualDestructor();
virtual GUID & IID_This(void) = 0;
};
The destructor is declared as virtual because we want all the destructors to be called even if delete X is applied to an object masquerading as a lower level class. To avoid this occupying space in the main v-table and thereby confusing OLE, the virtual destructor is declared in a separate base class. The method for returning a reference to the class ID is put there as well for the same reason.
/* OLE automation classes may safely be derived from this one provided that the IID_Library for the type library is defined and the method IID_This() is implemented for each class ...*/
struct SimpleDispatch : IDispatch, VirtualDestructor
{
unsigned long m_refs;
SimpleDispatch(); // constructor sets ref count to one
// IUnknown methods ...
HRESULT __stdcall QueryInterface(REFIID riid, void **ppv);
unsigned long __stdcall AddRef(void);
unsigned long __stdcall Release(void);
// IDispatch methods ...
HRESULT __stdcall GetTypeInfoCount(unsigned int *pctinfo);
HRESULT __stdcall GetTypeInfo(unsigned int itinfo, LCID lcid, ITypeInfo
**pptinfo);
HRESULT __stdcall GetIDsOfNames(REFIID riid, LPOLESTR *rgszNames, unsigned int
cNames, LCID lcid, DISPID *rgdispid);
HRESULT __stdcall Invoke(DISPID dispidMember, REFIID riid, LCID lcid, unsigned
short wFlags, DISPPARAMS *pdispparams, VARIANT *pvarResult, EXCEPINFO
*pexcepinfo, unsigned int *puArgErr);
};
The source file OleAut.cpp contains the implementation of SimpleDispatch. The implementation of the class defined in the ODL file is in OleTest.cpp.
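For readers without the sample files to hand, the shape of the methods that the TestObj interface asks for can be sketched in portable C++. This is not the content of OleTest.cpp: TestObjSketch, Result, kOk and kBadPointer are names invented here, std::wstring stands in for BSTR, and the get_/put_ prefixes are assumed to follow the convention of generated headers. What it tries to show is the pattern implied by the ODL: every method returns a status code, and the value that VB sees as the result comes back through the final retval pointer.

```cpp
#include <cassert>
#include <string>

typedef long Result;             // stand-in for HRESULT
const Result kOk = 0;            // stand-in for NOERROR
const Result kBadPointer = -1;   // stand-in for a failure code

struct TestObjSketch
{
    std::wstring m_name;
    double m_value;

    TestObjSketch(const std::wstring &name, double value)
        : m_name(name), m_value(value) {}

    // [propget] HRESULT name([out,retval] BSTR *name);
    Result get_name(std::wstring *name) const
    {
        if (name == nullptr) return kBadPointer;
        *name = m_name;   // the caller receives its own copy
        return kOk;
    }
    // [propput] HRESULT name([in] BSTR name);
    Result put_name(const std::wstring &name)
    {
        m_name = name;    // copied, since the caller keeps ownership of its string
        return kOk;
    }
    // [propget, id(0)] HRESULT value([out,retval] double *value);
    Result get_value(double *value) const
    {
        if (value == nullptr) return kBadPointer;
        *value = m_value;
        return kOk;
    }
    // [propput, id(0)] HRESULT value([in] double value);
    Result put_value(double value)
    {
        m_value = value;
        return kOk;
    }
    // HRESULT square([out,retval] double *square);
    Result square(double *square) const
    {
        if (square == nullptr) return kBadPointer;
        *square = m_value * m_value;
        return kOk;
    }
};
```

In the real object these would be virtual __stdcall methods of a class derived from SimpleDispatch, laid out in the same order as in the ODL interface.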
If you want to try the sample, the project, consisting of the ODL file and the two CPP files, should be set up so that the target DLL is in WINNT35/system32, or equivalent, or at least on the search path.
Once this is completed, and the type library references are set up correctly, the following VB code should work:
Dim T As TestObj
Set T = NewTestObj("Test 1",15)
Form1.Print T.name; ": "; T.value; " squared is "; T.square
T.name = "Test 2"
T = 16
Form1.Print T.name; ": "; T; " squared is "; T.square
In the last two lines we make use of the fact that value is the default property, enabling T on its own to be used as a shorthand for T.value (except in a Set statement). In an Excel module we might have the following:
Function SquareIt(X As Double) As Double
Dim T As TestObj
Set T = NewTestObj("", X)
SquareIt = T.square
End Function
which gives us a not very efficient way of getting the square of a number.
I hope that the foregoing has demonstrated that building OLE automation objects in C++ does not have to be horrendously difficult. Despite the complexity of OLE, the principles are simple. You should now find Microsoft’s documentation on the subject a little more comprehensible. There are of course many more useful features to be explored (local servers, collections, etc.) and I hope to return to these in another document.
If the parameter is designated as "in" in the ODL file, it may be read, but not modified. All memory management should be left to the caller, which means that arrays, strings and OLE objects must not be freed after use. Also, it must not be assumed that the parameter will continue to exist once the method is exited. If the parameter does need to be retained, then it should be copied, using API functions such as SysAllocString for strings and SafeArrayCopy for arrays. AddRef should be called on OLE objects that need to be retained. If the parameter is designated "out", then the recipient of the parameter takes over the memory management. It is responsible for freeing strings and arrays passed in this way, and calling Release on objects. A newly-created object should be passed with reference count set to one. Flout these rules at your peril. They are especially important in the case of passing object references across process boundaries, because the object received/passed is a proxy, an artefact of OLE, with a reference count that is to some extent independent of the original object. OLE uses the information in the type library to work out how to build the proxy, so if a method is not listed here, or is not declared in the right way, then using it in a remote client will bring disaster. A proxy may also be built through the methods IClassFactory::CreateInstance and IUnknown::QueryInterface. It is important to note that this proxy is just for the IID passed, so if the IID is IID_IUnknown, then only the three IUnknown methods will be callable in the remote client. The IID next to the interface declaration must be used if access to all the methods is required.
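The ownership rules for strings can be illustrated with a small portable sketch. It does not use the OLE string API: AllocString and FreeString are hypothetical stand-ins for SysAllocString and SysFreeString, Holder is an invented class, and the g_liveStrings counter is there only so that leaks and double frees would show up. The "in" parameter is copied because the caller may reuse or free its buffer as soon as the call returns; the "out" parameter is a fresh allocation which the caller must free.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

static int g_liveStrings = 0;   // number of allocations not yet freed

// Hypothetical stand-in for SysAllocString: returns a heap copy of src.
wchar_t *AllocString(const wchar_t *src)
{
    const std::size_t n = std::wcslen(src);
    wchar_t *copy = new wchar_t[n + 1];
    std::wmemcpy(copy, src, n + 1);   // includes the terminating null
    ++g_liveStrings;
    return copy;
}

// Hypothetical stand-in for SysFreeString: a null pointer is ignored.
void FreeString(wchar_t *s)
{
    if (s == nullptr) return;
    delete [] s;
    --g_liveStrings;
}

struct Holder
{
    wchar_t *m_name;

    Holder() : m_name(nullptr) {}
    ~Holder() { FreeString(m_name); }
    Holder(const Holder &) = delete;
    Holder &operator=(const Holder &) = delete;

    // "in": read the caller's string, never free it, copy it to retain it.
    void put_name(const wchar_t *name)
    {
        wchar_t *copy = AllocString(name);
        FreeString(m_name);
        m_name = copy;
    }

    // "out": give the caller a fresh copy, which the caller must free.
    void get_name(wchar_t **name) const
    {
        assert(name != nullptr);
        *name = AllocString(m_name ? m_name : L"");
    }
};
```

The same division of labour applies to object references, with AddRef taking the place of the copy and Release the place of the free.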
Objects of types not necessarily determined at compile time, or all user-defined objects in early versions of VB, are created by a call of the form
Dim T As Object
Set T = CreateObject("App.ObjectType")
or, more recently,
Dim T As New ObjectType
This triggers the following sequence of events:
An OLE object may support both 16 and 32 bit interfaces by providing keys for both InprocServer32 and InprocServer (similarly with the other keys). VB can be relied on not to get confused between them.
GetObject will not be explored here. Suffice it to say that it uses non-automation OLE interfaces to read an object from a file, or other persistent storage.
The class GUID (CLSID) in the System Registry is not the same as the GUID applied to the interface declaration earlier in this document. Instead it applies to the declaration coclass, which is used to group related interfaces together in the type library. The QueryInterface method is used to move between these related objects, which may be of any OLE type (i.e. not just IDispatch).