Peter’s Programming Notes: Memory Madness

Memory Madness

Understanding handles and how to use them.

By Peter N Lewis, Perth, Australia

First published in MacTech Magazine

Peter N Lewis is a successful shareware author. He founded Stairways Software Pty Ltd in 1995 and specializes in TCP/IP products but has been known to diversify into other areas.

Introduction

Using the Memory Manager's handles effectively is a very important part of programming on the Macintosh. Using them badly is also a prime cause of many system crashes. This article describes the difference between memory and resource handles, and operations on them that you should use or avoid. After a brief introduction to the Macintosh memory model, I'll list a bunch of rules or guidelines along with the reasoning behind them [1].

Background

If you know the basics of pointers, handles, resources and so forth, you can skip on to the next section. I'll try to keep this reasonably simple, but if you don't know what a pointer is, you should probably skip this whole article and go buy a beginner's guide to programming.

On the Macintosh, the Memory Manager (the system software which keeps track of all memory) divides the available RAM (including virtual memory) into chunks called "heap zones". When your application is launched, you are allocated some memory as a single heap zone (the size of this heap is defined by the "Get Info" size you set in the Finder or originally in the SIZE resource of your application). You use Memory Manager routines to allocate or release memory in your application zone, either directly with routines like NewPtr and NewHandle, or indirectly with routines like GetResource. Lots of routines allocate memory in your heap, but this article is going to concentrate only on the Memory Manager and the Resource Manager. Similarly, you can allocate memory and create your own heap zones, and you can allocate memory outside your own heap zone (either in the System Zone or as Temporary Memory), but I'll stick to just your application zone.

The simplest way to allocate memory is to use the NewPtr routine. You tell it how many bytes you want, and it returns a pointer to the memory or nil (NULL) if there was not enough memory available. One problem with pointers is that you often can't resize them. Even if there is lots of free space left in the heap, there might be something allocated in the memory just after the pointer, so growing it in place becomes impossible. A solution to this problem is to use handles.

A handle is a Memory Manager structure which is basically a pointer to a pointer (the first pointer is the handle, the second one is called a master pointer). Your code remembers the handle, and then to resize it, the Memory Manager can allocate new space anywhere in the heap, move the data there, and update the master pointer to point to the new location. You allocate a handle using NewHandle and you release it using DisposeHandle.

Figure 1. A diagram showing a handle
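
To make the pointer-to-a-pointer idea concrete, here is a minimal sketch in portable C. It is a toy model only: the Toy names are invented for this illustration, malloc and free stand in for the heap, and the real Memory Manager keeps master pointers and block sizes in its own structures. What it shows is that the handle value your code remembers stays the same while the master pointer is updated when the data moves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are not the Memory Manager's types. */
typedef char *ToyMasterPtr;       /* points at the data; changes when the data moves */
typedef ToyMasterPtr *ToyHandle;  /* points at the master pointer; stays fixed */

/* Allocate a master pointer plus a data block; returns NULL on failure. */
ToyHandle toy_new_handle(size_t size) {
    ToyHandle h = malloc(sizeof *h);
    if (h == NULL) return NULL;
    *h = malloc(size > 0 ? size : 1);
    if (*h == NULL) { free(h); return NULL; }
    return h;
}

/* Resize by moving: allocate a new block, copy, then update the master pointer.
   The caller passes old_size because this toy keeps no block header.
   Returns 0 on success, -1 on failure (handle left unchanged). */
int toy_resize_handle(ToyHandle h, size_t old_size, size_t new_size) {
    assert(h != NULL && *h != NULL);
    char *moved = malloc(new_size > 0 ? new_size : 1);
    if (moved == NULL) return -1;
    memcpy(moved, *h, old_size < new_size ? old_size : new_size);
    free(*h);
    *h = moved;
    return 0;
}

/* Release the data block and the master pointer. */
void toy_dispose_handle(ToyHandle h) {
    if (h != NULL) { free(*h); free(h); }
}
```

After a successful toy_resize_handle the caller's ToyHandle is still valid, but any copy of the old *h is not, which is the root of most of the rules that follow.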

One of the most common times that you will use handles is when dealing with resources. All resources on the Macintosh are allocated as handles. You allocate a resource handle using GetResource and you release it using ReleaseResource or by closing the resource file.

Real Pointers and Handles

The Mac operating system is not known for robust APIs (particularly the APIs that have been with us since 1984). If you pass invalid parameters to the operating system, at best it will do nothing useful, at worst it will crash the system and corrupt people's data. So it is very important to ensure that you are always playing nice with the Memory Manager.

1 Always check whether a memory allocation returns nil.

I know you've heard this a thousand times before, but it can't be repeated enough. NewPtr, NewHandle, GetResource, and so forth can all fail and return nil. If your code blissfully ignores this fact you are going to go down in flames. It is especially important not to pass the nil returned by NewPtr to DisposePtr; that is a sure way to die (passing nil to DisposeHandle, on the other hand, is perfectly safe). See the section on GrowZones for a way of relaxing this rule a little bit.

2 Never fake pointers or handles.

A Memory Manager Pointer is more than just any old pointer you get using &variable (or @variable in Pascal). Similarly a Handle is more than just any old pointer to any other pointer. Pointers you construct yourself like this are called fake pointers or handles; real Pointers and Handles must be allocated by the Memory Manager. If you try something really silly like DisposePtr(&variable), very bad things will happen. Also, the master pointer is not a real Pointer, so never do anything like:

h := NewHandle( 10 );
SetPtrSize( h^, 20 );

3 Never mess with the master pointer.

As described above, a Handle is a pointer into a block of Memory Manager memory which contains a master pointer pointing to your data. You must never modify the master pointer in any way. For example, never ever do something like this:

h = NewHandle( 10 );
for ( int i = 0; i < 10; i++ ) {
	*(*h)++ = 0;
}

Not only will this corrupt the heap, but it would be much simpler to just use NewHandleClear.

4 Always colour between the lines.

It probably goes without saying (but I'll say it anyway): don't write to any memory outside the Handle's or Pointer's allocated space. If you allocate 10 bytes (and it succeeds), make sure you only write to the first ten bytes pointed to by the pointer or master pointer.

5 Always dispose memory exactly once.

Once you dispose of memory (for example, by calling DisposePtr, DisposeHandle, ReleaseResource or closing the resource file), the pointer or handle you had is no longer valid. You must not use it for any purpose, and especially you must not dispose of it again; doing so will corrupt the heap.
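
One common way to make "exactly once" harder to get wrong is to clear your variable at the moment you dispose. The following is a minimal sketch of that habit in portable C, with free standing in for the Toolbox dispose calls; toy_dispose is a hypothetical helper name, not a Memory Manager routine.

```c
#include <assert.h>
#include <stdlib.h>

/* Dispose the block that *p refers to (if any) and clear the caller's
   variable, so a second call through the same variable does nothing.
   free() is only a stand-in here; unlike DisposePtr as described above,
   the C standard guarantees free(NULL) is a harmless no-op. */
void toy_dispose(void **p) {
    assert(p != NULL);
    free(*p);
    *p = NULL;
}
```

This only protects the one variable you pass in. If you copied the pointer or handle somewhere else, that copy is still stale and must not be used.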

Locking and Purging

The important feature about handles is the ability for the data to move around in memory so that it can be resized. Unfortunately, this also introduces a lot of possible problems since the Memory Manager can move the memory any time it is called, directly or indirectly, by you or anyone else (your handle will stay valid, but the master pointer will change).

6 Always lock your handles when you dereference.

You must lock a handle (using HLock) any time you dereference a handle (that is, any time you remember the master pointer in another variable or pass it to another procedure or use a "with h^ do" statement in Pascal) unless you are absolutely sure you are not going to call any routines that may move memory.

There is a list of routines that may move memory, which would seem to imply that there is a list of routines that must not move memory. But since not everyone has read either list, and since many of those who have not read them have spent their time more productively by writing System Extensions that patch routines that are not supposed to move memory so that they now do, about the best course of action is to assume that every system routine that you call may move memory. The only exceptions I would make to this are routines called at interrupt level (since you are not allowed to call any Memory Manager routines at interrupt level), and BlockMove and BlockMoveData.

It is safer to lock and then unlock a handle than it is to find out the hard way that a routine sometimes moves memory - these bugs are basically impossible to track down. But on the other hand you don't have to go completely insane either - if you don't call any routines at all, then the memory cannot move. So it is perfectly safe to scan a handle looking for a linefeed for example.
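
Here is a rough sketch of both halves of that advice in portable C. It is a toy model under stated assumptions: ToyHandle is just a pointer to a pointer, and toy_call_that_moves_memory is a hypothetical stand-in for any system call that relocates the block. The scan touches no other routine, so it can safely hold the master pointer in a local; across the call that moves memory, the code carries an offset instead and dereferences the handle again afterwards.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy handle: a pointer to a master pointer. Not the real Memory Manager type. */
typedef char **ToyHandle;

/* Scanning calls no other routines, so the block cannot move during the loop.
   Returns the offset of the first linefeed, or -1 if there is none. */
long toy_find_linefeed(ToyHandle h, size_t size) {
    assert(h != NULL && *h != NULL);
    const char *p = *h;              /* safe: nothing below can move memory */
    for (size_t i = 0; i < size; i++) {
        if (p[i] == '\n') return (long)i;
    }
    return -1;
}

/* Hypothetical stand-in for "any call that may move memory": relocates the
   block and updates the master pointer. Returns 0 on success, -1 on failure. */
int toy_call_that_moves_memory(ToyHandle h, size_t size) {
    assert(h != NULL && *h != NULL);
    char *moved = malloc(size > 0 ? size : 1);
    if (moved == NULL) return -1;
    memcpy(moved, *h, size);
    free(*h);
    *h = moved;                      /* pointers copied from *h earlier are now stale */
    return 0;
}

/* Carry the offset (not a pointer) across the call, then dereference again.
   Returns the byte at that offset, or -1 on a bad offset or failure. */
int toy_byte_after_move(ToyHandle h, size_t size, long offset) {
    if (offset < 0 || (size_t)offset >= size) return -1;
    if (toy_call_that_moves_memory(h, size) != 0) return -1;
    return (unsigned char)(*h)[offset];
}
```

With the real Memory Manager you would normally lock the handle instead; remembering offsets is simply the option that works when you would rather not lock.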

7 Use HGetState/HLock/HSetState instead of HLock/HUnlock.

Imagine you write a procedure that needs to lock a handle, for example:

procedure DontDemonstrateSetupData( data: Handle );
begin
	HLock( data );
	DoStuff( data^, GetHandleSize( data ) );
	HUnlock( data );
end;

If you now call this routine after locking a handle, it will cheerfully unlock it for you, with potentially terrifying results. Instead of that, you should use HGetState to preserve and restore the state.

procedure DemonstrateInitializeData( data: Handle );
	var
		state: SignedByte;
begin
	state := HGetState( data );
	HLock( data );
	DoStuff( data^, GetHandleSize( data ) );
	HSetState( data, state );
end;

An alternative approach is to assume all handles are unlocked, and any routine can unlock a handle. So after any call to a procedure you have to relock and redereference the handle.
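
The difference between the two procedures above is easy to see in a toy model. The sketch below, in portable C, reduces a handle to nothing but its lock bit; all of the Toy and helper names are invented for illustration and only mimic the shape of HGetState, HLock, HUnlock and HSetState.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy block that tracks only a lock flag; the real Memory Manager
   stores this state itself, alongside other flags. */
typedef struct {
    bool locked;
} ToyBlock;

typedef unsigned char ToyState;

ToyState toy_get_state(const ToyBlock *b) { assert(b != NULL); return b->locked ? 1 : 0; }
void toy_set_state(ToyBlock *b, ToyState s) { assert(b != NULL); b->locked = (s != 0); }
void toy_lock(ToyBlock *b) { assert(b != NULL); b->locked = true; }
void toy_unlock(ToyBlock *b) { assert(b != NULL); b->locked = false; }

/* Careless helper: always leaves the block unlocked, whatever the caller wanted. */
void helper_lock_unlock(ToyBlock *b) {
    toy_lock(b);
    /* ... work with the dereferenced data here ... */
    toy_unlock(b);
}

/* Careful helper: puts back whatever state the caller had on entry. */
void helper_save_restore(ToyBlock *b) {
    ToyState saved = toy_get_state(b);
    toy_lock(b);
    /* ... work with the dereferenced data here ... */
    toy_set_state(b, saved);
}
```

If the caller locked the block first, it is still locked after helper_save_restore but unlocked after helper_lock_unlock, which is exactly the surprise Item 7 is warning about.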

8 Watch out for purgeable resources.

If you tell the Memory Manager that a handle is purgeable (either using HPurge or setting a resource's purge bit using ResEdit), then the memory may be released any time it could be moved (if you lock the handle it will not be purged, so there is generally no need to call both HLock and HNoPurge). The normal case for using purgeable handles is when you make a resource handle purgeable; then you can load the resource using GetResource (or GetIndString or whatever) and not bother releasing it. The Resource Manager will release it automatically if the resource file is closed, and the Memory Manager will release it if you run low on memory.

The best way to deal with purgeable resources is to always call GetResource when you need the resource, and then use either HGetState/HLock/HSetState (as described in Item 7) or HGetState/HNoPurge/HSetState to ensure that the data is not released until after you are finished with it. I would recommend that you never use HNoPurge and HPurge to change a handle's purge status permanently; instead, a resource handle should always remain either purgeable or non-purgeable. In the former case you should be careful to always reload the resource (using GetResource, or if you remembered the handle, using LoadResource) and to lock it while it is in use.

Memory vs Resource Handles

There are subtle but important differences between a memory handle (one you get by calling NewHandle) and a resource handle (that you get by calling GetResource).

9 Match NewHandle/DisposeHandle and GetResource/ReleaseResource.

You must make sure you always release memory handles using DisposeHandle and release resource handles using ReleaseResource. This is because the Resource Manager keeps extra information related to resource handles, so you must use ReleaseResource to ensure the Resource Manager knows that the resource handle is no longer valid.

10 Always know whether a handle is a resource handle or a memory handle.

One consequence of Item 9 is that you must always know whether a handle is a resource handle or a memory handle in order to dispose of it. For instance, you should never have code that looks like this:

h = GetResource( 'STR ', 128 );
if ( h == NULL ) {
	h = (Handle) NewString( "\pHello" );
}

At the end of this sequence, you don't know whether h is a resource handle or a memory handle, so how can you dispose of it properly? The simple solution in this case is to ensure that at the end of the snippet we are left with a memory handle no matter where we got the memory from. You can do this by using DetachResource to change the resource handle returned by GetResource into a memory handle:

h = GetResource( 'STR ', 128 );
if ( h == NULL ) {
	h = (Handle) NewString( "\pHello" );
} else {
	DetachResource( h );
}

We can now dispose the memory using DisposeHandle.

It is possible to determine whether a handle is a resource handle or a memory handle (isresource := (HomeResFile( h ) <> -1)), but in general you should know what kind of handle you are dealing with. I suppose one solution would be to write a routine like this:

procedure DisposeAnything( var h: Handle );
begin
	if h <> nil then begin
		if HomeResFile( h ) <> -1 then begin
			ReleaseResource( h );
		end else begin
			DisposeHandle( h );
		end;
		h := nil;
	end;
end;

However this is not a particularly efficient solution since HomeResFile probably takes a fair amount of time to confirm whether a handle comes from a resource file or not.

11 Convert between resource and memory handles where appropriate.

As seen in Item 10, it is possible to convert a resource handle into a memory handle using DetachResource. You can also go in the other direction by adding a handle to a resource file using AddResource. Releasing the memory is not the only time you have to know what kind of handle you have: you also cannot add a resource handle to a resource fork, so you will have to use DetachResource before calling AddResource, like this:

h = Get1Resource( 'STR ', 128 );
err = ResError();
if ( h != NULL ) {
	DetachResource( h );
	AddResource( h, 'STR ', 129, "\pteststring" );
	err = ResError();
}

12 CloseResFile releases all resources.

When you close a resource file, all the resources are automatically released. This means that any resource handles you have that came from that resource file are now invalid so you must not use them (including not calling ReleaseResource or DisposeHandle on them). If you want to keep a resource handle around after you close the file, you must turn it into a memory handle by calling DetachResource. So for example:

result := nil;
resfile := FSpOpenResFile( spec, fsRdPerm );
if resfile <> -1 then begin
	str1 := GetResource( 'STR ', 128 );
	str2 := GetResource( 'STR ', 129 );
	if (str1 <> nil) & (str2 <> nil) then begin
		if length(str1^^) > length(str2^^) then begin
			result := str1;
		end else begin
			result := str2;
		end;
		DetachResource( result );
	end;
	CloseResFile( resfile );
end;

At the end of this code, result is either nil or, assuming both string resources exist, a memory handle containing the longer of the two strings. There are a bunch of things to notice in this code. First, it defends against failing to open the resource file or failing to get the string resources. Next, it lets the Resource Manager release the resource handles (so for example, if str1 is nil, str2 will still be released by the Resource Manager when the file is closed). Also, the code is careful to detach the resource handle we wish to keep past the CloseResFile so that it is not automatically released.

Cool Memory Manager Routines

The Memory Manager provides a lot of neat routines for working with handles. Most of them you can duplicate yourself, but why waste time and introduce potential bugs when you can just get the OS to do it for you?

13 Use PtrAndHand and friends.

PtrAndHand appends a chunk of memory to the end of a handle. Similarly, HandAndHand appends a handle to another handle. PtrToXHand replaces a handle's data with new data. In all cases, the source data is unaffected and the destination handle (which must already be a valid handle) is resized appropriately and the new data is copied in. If the destination handle was a resource handle, it remains a resource handle (although it will only be written back if you call ChangedResource).

PtrToHand and HandToHand allocate a new handle and initialize its size and contents based on the input values. The resulting handle is always a memory handle even if the source handle was a resource handle.

Another nice thing about these calls is that they return OSErrs so you don't have to call MemError.

PtrAndHand is really useful for building a handle from a sequence of input data (for example, you might have a handle to a text log, and you might append new lines to the log by using PtrAndHand).

14 Use Munger where appropriate.

Munger is my favourite Macintosh routine. It is a true power-geek tool. If you can master this routine you can amaze your friends with astonishing feats. Munger looks pretty complicated (it takes a handle, two pointers and three longs), and it does a bunch of almost unrelated things depending on the exact parameters you pass it. But once you get the hang of it, it is fairly easy to use.

function Munger( h: Handle; offset: longint; ptr1: Ptr; len1: longint; ptr2: Ptr; len2: longint ): longint;
pascal long Munger( Handle h, long offset, const void *ptr1, long len1, const void *ptr2, long len2 );

Basically, what Munger (which rhymes with plunger according to Inside Mac) does is search and optionally modify a handle (the first parameter). The second parameter is an offset to start searching from (normally this is zero to start from the beginning of the handle). The next two parameters (ptr1 and len1) describe the data to search for (it is a byte search, so it is case sensitive, and world-script ignorant so you probably cannot use it for world-script text). One trick with the search parameters is that if you pass it nil for the pointer it will act as if it finds a match immediately (I'll give you an example below, so don't panic if you didn't quite follow that). The final two parameters (ptr2 and len2) describe the replacement data. The matched data will be replaced with this second chunk of memory, assuming that the pointer is not nil (one trick here is when you want to delete the found data, you need to pass a non-nil value with a zero size - I normally use the address of the source handle but any non-nil value will do). The return value of Munger is either an offset (the offset of the matched data when you are only searching; Inside Macintosh documents it as the offset of the first byte past the replacement when data is replaced), or a negative value if no match was found. Munger also sets MemError if it fails to resize the handle, so if you are inserting data you must check for an error. OK, you are probably lost by now so let's look at some examples.

In the first example, we will just use Munger to insert some text.

h = (Handle) NewString( "\pHello World!" );
(void) Munger( h, 7, nil, 0, (Ptr) "Cruel ", 6 );
err = MemError();

What this does is match zero bytes at offset seven (six characters plus the length byte) in the handle, and then replace those zero bytes with six bytes of "Cruel " (It is customary where I come from to make your first program in any new language print "Hello Cruel World!" - I'm not sure what that says about the people I hang out with but I expect most programmers can see the logic in it ;-). Don't forget to reset the Pascal string length with (**h) = GetHandleSize( h ) - 1;

Alternatively, if you're having a good day, you might prefer to remove the Cruel like this:

h = (Handle) NewString( "\pHello Cruel World!" );
(void) Munger( h, 7, nil, 6, &h, 0 );

Starting from seven bytes into the handle, this matches any six bytes and replaces them with zero bytes (starting from &h, not that that matters much, all that matters is that &h != NULL). We don't need to check MemError after Munger because we are reducing the size of the handle, and the Memory Manager pretty much has to be able to cope with that (of course, we should have tested the handle returned by NewString to ensure that succeeded!).

Alternatively, you might be having a really really good day, and want to replace "Cruel " with "Wonderful " like this:

h = (Handle) NewString( "\pHello Cruel World!" );
where = Munger( h, 0, (Ptr) "Cruel ", 6, (Ptr) "Wonderful ", 10 );
err = MemError();

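This searches for the six bytes "Cruel " and replaces them with the ten bytes "Wonderful ". The match is found at offset seven; note that Inside Macintosh documents that when Munger makes a replacement it returns the offset of the first byte past the replacement (here seventeen), and returns the offset of the match itself only when ptr2 is nil.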

If you just want to search for where the "Cruel " appears, you can do this:

h = (Handle) NewString( "\pHello Cruel World!" );
where = Munger( h, 0, (Ptr) "Cruel ", 6, nil, 0 );
if ( where >= 0 ) {
	printf( "Found at offset %ld\n", where );
}

The handle is not modified because ptr2 is NULL.
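
If it helps to see what the insert and delete examples do to the bytes, here is a small model in portable C. It is only a sketch under assumptions: ToyBuffer, toy_splice and the two Pascal string helpers are invented names, the buffer is a malloc block rather than a handle, and the model covers just the nil ptr1 case (replace a known range at a known offset). It does not search, and it does not try to reproduce Munger's return value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable byte buffer standing in for a handle's data (toy model). */
typedef struct {
    unsigned char *data;
    size_t size;
} ToyBuffer;

/* Replace remove_len bytes at offset with insert_len bytes from src.
   Returns 0 on success, -1 on a bad range or allocation failure
   (in which case the buffer is left unchanged). */
int toy_splice(ToyBuffer *b, size_t offset, size_t remove_len,
               const void *src, size_t insert_len) {
    assert(b != NULL);
    if (offset > b->size || remove_len > b->size - offset) return -1;
    size_t tail = b->size - offset - remove_len;
    size_t new_size = b->size - remove_len + insert_len;
    unsigned char *out = malloc(new_size > 0 ? new_size : 1);
    if (out == NULL) return -1;
    if (offset > 0) memcpy(out, b->data, offset);
    if (insert_len > 0) memcpy(out + offset, src, insert_len);
    if (tail > 0) memcpy(out + offset + insert_len, b->data + offset + remove_len, tail);
    free(b->data);
    b->data = out;
    b->size = new_size;
    return 0;
}

/* Fill the buffer with a Pascal string: a length byte followed by the text. */
int toy_new_pstring(ToyBuffer *b, const char *text) {
    size_t len = strlen(text);
    if (len > 255) return -1;
    b->data = malloc(len + 1);
    if (b->data == NULL) return -1;
    b->data[0] = (unsigned char)len;
    memcpy(b->data + 1, text, len);
    b->size = len + 1;
    return 0;
}

/* Same job as (**h) = GetHandleSize( h ) - 1 in the text above. */
void toy_fix_pstring_length(ToyBuffer *b) {
    b->data[0] = (unsigned char)(b->size - 1);
}
```

Splicing "Cruel " in at offset 7 with remove_len 0 mirrors the insert example, and splicing with remove_len 6 and insert_len 0 mirrors the delete example; in both cases the length byte has to be fixed afterwards.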

GrowZones

The Macintosh system becomes very fragile when you run out of application memory. It is also very tedious to have to guard every single tiny memory allocation (including creating new objects and new empty handles, and so forth). One way to reduce the chance of bad things happening, and to let you relax a little bit, is to install a GrowZone routine. The Memory Manager calls this routine when it cannot meet a request for memory. It is very easy to write a simple GrowZone procedure by allocating some spare memory when your program starts up (say 20k); then, when the GrowZone routine is called, you simply deallocate this memory in the hope that the Memory Manager will now be able to meet the request. In the event loop you can check whether the memory has been released, and if so (and if you cannot reallocate your spare memory) then you can display an alert and quit gracefully. Also, when you make a large memory allocation, check that the allocation succeeded (i.e., the handle is not nil), but also check that the GrowZone memory is still available. If the GrowZone has fired and released its memory, you should dispose of the handle you just created and pretend that the memory allocation failed.
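
The reserve scheme described above can be sketched in portable C. This is a toy model with loud assumptions: every name is invented for illustration, malloc stands in for the heap, toy_flaky_alloc is a hypothetical allocator that fails on demand so a "nearly full heap" can be simulated, and the wrapper calls toy_grow_zone itself, whereas on the Macintosh the Memory Manager calls your GrowZone routine for you.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef void *(*ToyAlloc)(size_t);

static void *g_reserve = NULL;     /* spare block set aside at startup */
static int g_failures_left = 0;    /* how many times the flaky allocator fails */

/* Hypothetical allocator that simulates a nearly full heap. */
void toy_set_failures(int n) { g_failures_left = n; }
void *toy_flaky_alloc(size_t bytes) {
    if (g_failures_left > 0) { g_failures_left--; return NULL; }
    return malloc(bytes);
}

/* Allocate the reserve (at startup, or again from the event loop). */
bool toy_reserve_init(size_t bytes) {
    if (g_reserve == NULL) g_reserve = malloc(bytes);
    return g_reserve != NULL;
}
bool toy_reserve_available(void) { return g_reserve != NULL; }

/* What a grow-zone routine would do: hand back the reserve.
   Returns true if any memory was actually released. */
bool toy_grow_zone(void) {
    if (g_reserve == NULL) return false;
    free(g_reserve);
    g_reserve = NULL;
    return true;
}

/* Small allocation: on failure, release the reserve and retry once. */
void *toy_alloc_small(ToyAlloc alloc, size_t bytes) {
    assert(alloc != NULL);
    void *p = alloc(bytes);
    if (p == NULL && toy_grow_zone()) p = alloc(bytes);
    return p;
}

/* Large allocation: treat it as failed if it used up the reserve. */
void *toy_alloc_large(ToyAlloc alloc, size_t bytes) {
    void *p = toy_alloc_small(alloc, bytes);
    if (p != NULL && !toy_reserve_available()) {
        free(p);
        p = NULL;
    }
    return p;
}
```

The event loop's job in this model would be to call toy_reserve_available and, if the reserve is gone, try toy_reserve_init again before deciding whether to alert the user and quit.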

Conclusion

All of the above should be considered as guidelines - you should follow them unless you have really good reasons not to. Even if you follow all of them, you will sometimes run into trouble with some sort of heap corruption or memory problem. You are not completely on your own when this happens; there are tools that can help you. For example, the Debugging Modern Memory Manager can detect some of the problems caused by ignoring (or forgetting) the guidelines I've listed, and Even Better Bus Error can detect writes to nil. Before shipping any program, you should install the DMM and EBBE and stress test your program - if you don't, you can be sure some of your users will, and you and they will both be much happier if you find these problems before you ship your program.

References

  1. Effective C++ by Scott Meyers. I use a very similar format to the one Scott uses in this excellent book.
  2. NIM Memory.