THE GETSPACE DYNAMIC STORAGE ALLOCATION PACKAGE The getspace package is a dynamic storage allocation package designed for use in C programs as a front end for malloc/free. It provides a portable allocator whose usage is less error prone and more debug friendly than using malloc/free. The package uses a best fit allocator sitting on top of the system allocation package. The package allocator is fairly space and time efficient; however it is not expected to be more efficient than the system allocator. WHY USE GETSPACE? The getspace package offers specific benefits; it makes memory allocation errors less likely to occur, less dangerous if they do occur, and easier to recognize and debug if they should occur. The package was designed to be used in production code. This is important; many software validation packages are too expensive to include in production code and are only used during development in test suites. This is backwards; production code is where you really want safety checks. The package provides a potentially better approach to error handling. C code that calls malloc and free directly must (should) test all malloc returns and take some action on a null return, which usually consists of writing an error message to stderr and exiting. These test/action sequences litter source code, are almost always untested and untestable, and militate against having a coherent error response policy. The getspace package is instrumented and includes embedded error analysis and error handling code. WHERE CAN I GET IT? This distribution has five files, this readme file (readme.txt), and four source files - getspace.c, trace.c, getspace.h, and trace.h. They can be found at: http://richardhartersworld.com/cri_a/source_code/getspace/ (The trace files are a utility package for recording the sequence of function calls. If you don't want to use them, edit out the calls to trace and don't include trace.h in getspace.c.) HOW DO I USE IT? Compile getspace.c and trace.c and include the compiled object files in your build. Include getspace.h in any file that contains code that will use the getspace allocator. There are two major entry points into the package, getspace and freespace. Their prototypes are: void * getspace (size_t request, char * id); int freespace(void * address, char * id); Getspace gets space. The first argument is the size of the block of space needed, and the the second is a pointer to an identification string. It returns an aligned pointer (see caveats below) to a block at least 'request' bytes in size. You do not need to check the pointer returned of getspace; either getspace returns a valid pointer or it calls an error handler and exits. Freespace frees space allocated by getspace. The first argument is the address of the space being returned, and the second is an identification string. It returns a success/failure flag that can be ignored except in special situations. The package tags each block of space that it manages with identification strings. You can use any string you like; however the LINELOC macro defined in getspace.his a good choice. It expands to the file name and line number where it appears. One big warning: Malloc/free and getspace/freespace are different allocators. Don't mix them up; if you use getspace to get some space don't try to free it with free. Similarly if you get some space from malloc don't try to free it with freespace. That's about all you need to know for casual usage. There is a section below for power users. WHY NOT STICK WITH MALLOC/FREE? Malloc/free based storage allocation is notoriously fragile. The big problem areas are buffer overwrites and memory leaks. Diagnosing these problems in code can be difficult. Memory overwrites tend to create mystery bugs, i.e., bugs that have no apparent connection with the code where they manifest. A memory overwrite in a buffer can create a storage fault in another buffer that is not logically connected to the storage. Worse, these bugs can lie dormant for a long time before manifesting. Worse still, many malloc/free implementations store information about storage blocks adjacent to the blocks themselves. These allocators are easily corrupted, and there is no guarantee that the allocator on your system isn't one of them. Memory leaks are like high blood pressure; they are silent killers. Not only are almost all implementations of malloc/free fragile, they are also uninformative. They do not provide substantial reports that let you diagnose errors and analyze memory usage patterns. In a word, they are uninstrumented. WHY NOT VALGRIND? There are a number of programs that are used to instrument and debug C programs. Valgrind is a good one. It is very good at detecting memory usage problems. Use it to improve the quality of your code. The issues with valgrind are (a) it is not portable, and (b) it is not a production environment tool. WHAT ARE THOSE BENEFITS AGAIN? * Protection against overwrites: The package cannot prevent overwrites; however it does provide means to limit damage caused by overwrites and to make their origin more detectable. It uses guard bytes to catch off by one and off by a few errors. Whenever a block is freed its guard bytes are checked for overwrites. When error reports are generated all guard bytes are checked. The package allocator does not comingle storage used by the allocator with storage used for allocation. This is an important feature; overwrites, except for fortuitous wild writes, do not affect the allocator and its information. * Action tagging: Every block under management is tagged with the call into the allocator that last touched the block and with a sequence number. These tags are invaluable as forensic aids when investigating allocator misuse. * Error trapping: The allocator traps errors, e.g., failure to allocate and erroneous frees, and aborts via an error handler that prints a detailed error report. * The package generates a detailed report upon demand or in response to a fatal error. The report includes a list of all blocks under management, both used and free. The entry for each block includes the block address, its sequence number, and its identifying tag. This report is useful for identifying memory leaks, possible origins of overwrites, stale allocations, and memory usage hotspots. * Open source: The source code is licensed under a BSD style license. It can be used and modified as desired as long as the license notice is not altered. * Light weight: Getspace is time and space efficient (but see caveats). Because it is, getspace can be in place of malloc without impacting performance. * Portable: The package is written in portable ISO C90. There are some portability issues that are noted below under caveats and considerations. THE GORY DETAILS ABOUT THE API There are six entry points to the getspace package, cfgspace, freespace, getspace, morespace, printspace, and queryspace. The prototypes are defined in getspace.h. They are: void cfgspace (struct cfgspace_s *); int freespace (void *, char * ); void * getspace (size_t, char * ); void * morespace (void *, size_t, size_t, char *); void printspace (FILE *, char *); size_t queryspace (void *); cfgspace: Cfgspace has one argument, a pointer to a configuration structure. Use this function to override the default values in the allocator. Its principal use is to change the allocator error file which is stderr by default. To use it create an instance of the configuration structure cfgspace_s defined in getspace.h. Initialize it to all zeroes and then set any desired fields to override values. For example: struct cfgspace_s cfg = {0,0,0,0,0,0,0,0,0}; ... cfg.fptr = my_report_file; cfgspace(&cfg); When cfgspace returns cfg is filled with current settings. See the section below on configuring the allocator for the configuration fields, their usage, their defaults, and their restrictions. freespace: Freespace has also two arguments, the address of the block of storage being freed, and the identifying string, typically LINELOC. Freespace returns TRUE (1) if the space was successfully freed and and FALSE (0) if it was not. By default, freespace aborts if the space was not successfuly freed. However the default action can be overridden by setting either of two flags, ufree and nokill. Freespace checks whether the address is legitimate, i.e., whether it is the address of a block of space allocated by getspace. (An address of 0 is treated as a no-op). If the ufree flag is set freespace assumes that unrecognized addresses came from malloc and calls free to deallocate them. (This obviously is dangerous but it can be convenient.) No error message is written, but FALSE is returned. If the nokill flag is set and the address is not legal, an error message is written and FALSE is returned. Common reasons for using the nokill flag is to protect against erroneously freeing a block twice, and to perform some speciallized recovery if an error should occur. getspace: Getspace has two arguments, the requested length and a pointer to an identifying string, typically LINELOC. A request of size 0 is an error. Failure of getspace to honor the request is also an error. The default action in response to an error is to write an error message, an error report, and abort. If the nokill flag is set an error message will be written and getspace will return a null pointer (0). A common reason for setting the nokill flag is to permit some kind of recovery action if getspace cannot allocate space. The returned pointer will be aligned in ordinary environments, but alignmentis not guaranteed. See caveats and conditions for a discussion of alignment. morespace: Morespace is an equivalent of realloc. It has four arguments. The first is the address of the buffer being resized; if it is 0 a fresh buffer is allocated. The second argument is the number of bytes to retain. The third argument is the number of bytes needed. Finally the fourth argument is the identification string. Morespace treats errors in the same way that getspace does. printspace: Printspace has one argument, a file pointer, that is assumed to be an error report. It prints a storage allocation report in the file. The report has four parts: (a) A table of storage usage information (b) The configurable elements settings (c) A table of the sizes of the address trees (d) The sequential storage map The storage allocation report is meant to serve as a resource for analyzing storage usage and as a forensic source for debugging memory allocation related errors. See the allocation report section for usage. queryspace: Queryspace is a utility. It has one argument, the address of a possibly allocated block of storage. It returns the block size, if any. If the block is not allocated it returns 0. Use queryspace if you want to know if a pointer is a valid pointer to a block or if you want to know how much usable space a block has. CONFIGURING THE ALLOCATOR (IF YOU MUST) For the most part the configuration parameters are not relevant to the user; they are there to facilitate debugging or to allow tuning of the allocator. The major exception is the error file option, fptr, which sets the error field. There are several flags that change default behaviour, e.g., nokill, zero, and ufree. Most of the parameters can be reset at any time. Some, however, can only be set during initialization. They must be set with a a call to cfgspace before any calls to the other getspace package entry points. The table below lists each of the options, when they can be set, who typically will set them, and what they are for. Item When Who What fptr anytime user This sets the error file. The default is stderr. If you are writing error logfiles set this parameter slabsz anytime admin The slab size; the default is 131072 bytes. Setting the slab size larger will mean fewer calls to malloc by getspace. nnode anytime admin The number of new nodes added to the node free list. It's hard to believe that it matters. nh startup admin This sets the number of address trees used by getspace. Increase this if getspace is managing many millions of blocks of space to tweak performance. nabits startup admin This is an option for 64 bit platforms. Currently it does nothing. idsize anytime admin The last field of a report is the identification string. This sets the maximum field size. The default is 20. ssz startup admin The number exact fit small lists. ufree anytime user The default response to an unrecognized address is to do an error exit. If this flag is set freespace will call free instead. Use this option for freeing space when you don't know whether it came from malloc or getspace. kslab anytime user Getspace gets storage from malloc in the form of large slabs. From time to time slabs may become completely empty. The default is to reuse them. If this flag is set getspace returns empty slabs to malloc. Use this option if memory usage is tight. dbgflag anytime user This flag exists for developers who are messing with the getspace code. Use this flag for turning on optional debug code. nokill anytime user The default response to a user error is to print an error message, a report, and exit. zero anytime user The default behaviour is for getspace not to initialize the space being delivered. If the zero flag is set the space will be zeroed out. THE STORAGE ALLOCATION REPORT The report is divided into six parts. Part I is a table of information about storage allocation. The information is fairly general and is self explanatory. Here is an example: I. Usage information table Current allocation ... 808 Maximum allocation ... 42008 Free space ........... 130264 Total space managed .. 131072 Current # of slabs ... 1 Maximum # of slabs ... 1 Number of nodes ...... 208 Ftree root node ...... 0 # of boundary nodes .. 2 # of blocks in use ... 1 # of free blocks ..... 2 All address trees are valid All large size trees are valid Part II lists the current settings of all configurable parameters. Here is an example: II. Configurable parameters Slab size ............ 131072 Max printed ID size... 20 Size of small list ... 256 Free list expansion .. 256 # of address hashes .. 1024 Use free flag ........ FALSE Delete empty slabs ... FALSE Debug flag ........... FALSE No kill on user error FALSE Zero out memory flag . FALSE The getspace package does not co-mingle allocator control information with storage that it allocates. This makes the allocator more secure. However when a block is deallocated the freespace routine has to locate the control block corresponding to the node. Getspace uses binary search trees for mapping addresses to control blocks. Instead of using one tree it hashes addresses into 1024 (nh) indicies of an array of trees. Part III lists the number of blocks in each tree. Part IV is the specialized tree used for finding the best fit small bin search tree. The various fields are printed out for each bin index. Part V is the available big blocks tree. It prints the traversal of the tree in ascending block size order. Part VI is the sequential storage map. It has one entry for each block of storage being managed by the getspace package. There are 13 columns, labelled as follows: U ............. Usage, S = slab boundary, U = in use, F = free E ............. Error code, X = neg size, E = overwrite size .......... Size of the block np ............ Pointer to node address ....... Address of the block sl ............ Pointer to preceding block sl->ad ........ Address of the preceding block sr ............ Pointer to following block sr->ad ........ Address of the following block al ............ Left linkage pointer ar ............ Right linkage pointer seq ........... Sequence number origin ........ Identification string Most of the columns are of little interest except for chasing bugs in the allocator - if there should actually be any. Three columns are particularly useful for analyzing storage usage faults, column 2 (E), column 12 (seq), and column 13 (origin). An E in column two indicates that either the eight bytes before the block or the eight bytes after the byte have been overwritten. The place where the block originated is given by the origin field. (This is why we want to use LINELOC). A sort on the sequence number will highlight blocks that were allocated early and never released. A sort on the origin will blocks that are allocated in a loop and not released. CAVEATS AND CONSIDERATIONS This is a relatively immature piece of software; it has only been built on one environment, the djgpp environment for windows. It has been extensively tested in that environment; howeverit has not yet been heavily used in a variety of production environments. The author does not expect that any bugs will be found. (He has been disappointed in such expectations in the past.) He would appreciate being notified of any bugs or problems encountered. Problem reports or comments can be sent to cri@tiac.net [2013: See note.] If possible, it would be useful if problem reports included a storage allocation report. There may be portability issues. The code is written in ANSI C89. Applications and test harnesses been compiled and built using gcc with these options: -O2 -ansi -pedantic -posix -Wall No errors or warnings were produced; the code seems to be quite clean. However the code has only tested and run in a 32 bit environment; 64 bit environments have not been tested. That said, here are some portability concerns that the author is aware of: (a) The address hashing code implicitly assumes a 32 bit architecture. (b) The reports print addresses (pointer values) in decimal. (c) It is assumed that adding multiples of 8 bytes to any address produced by malloc preserves alignment. UNDER THE HOOD If you are interested, here is some information about what is under the hood in the package and how it works. The getspace allocator is a slab based, best fit allocator. "Slab based" means that the allocator gets storage from the OS (actually malloc in this case) in the form of big blocks (slabs). This is in distinction to allocators that start life with a fixed amount of storage and with allocators that assume that the OS will expand a single monolithic block upon demand (think sbrk). Best fit allocators maintain an inventory of free blocks of various sizes. A best fit allocator honors a request with a block of the exact size if one is available; otherwise it finds the smallest size block in the inventory that is at least as large as the requested size. Best fit allocators tend to allocate memory very efficiently; however they present design problems. The problem is to select data structure(s) for maintaining the inventory that are time efficient as blocks are freed and allocated. Getspace uses a structure called a node to hold information about blocks of storage; each block has its own associated node. The node structure looks like this: struct node { /* Block control structure */ char *a; /* Address of block */ NODE *sl; /* Storage left pointer */ NODE *sr; /* Storage right pointer */ NODE *al; /* Left pointer */ NODE *ar; /* Right pointer */ char *id; /* Request identifier */ unsigned long seqno; /* Sequence # (epoch) */ }; (The structure is private to getspace.c; its name does not conflict with usage in any other file.) In the discussion below it should be remembered that blocks and nodes are 1 to 1. In the C universe it is normal for the allocate routine to return an address and for the argument to the deallocate routine to be an address. (There are alternatives; perhaps they should be considered but they were not for getspace.) This creates a problem; when freespace is handed an address it has to find the corresponding node for the block being returned. Many allocators handle this by embedding block information adjacent to the actual block. This is convenient but insecure; in getspace I have chosen to have nodes and the blocks they represent in separate areas. Accordingly getspace needs a way to map addresses to node. The method used is to first hash addresses to a fixed range (0:1023 is the default). The hashed address is used as an index into a table of binary search tree (BST) roots that is ordered by block addresses casted to size_t. This scheme is not strictly portable but it works in a 32 bit flat space implementation. Anyone porting to a different environment should check the address mapping code. Inserting and deleting addresses into the address trees (atrees) is handled by atree_enter_address and remove_from use. Note that the al and ar fields are used as left and right child links in atrees. This is a general pattern. During code execution a node will either be in exactly one data structure or in none at all. The al and ar fields are reused by the various data structures. Getspace needs is a way to find "best fit" blocks in its inventory of blocks. As a first step all requested sizes are rounded up to the nearest multiple of ASIZE bytes and an additional ASIZE bytes is added to allow room for check bytes. The macro ASIZE is 8; it is the alignment size. Since all sizes are multiples of ASIZE it is convenient to divide rounded up sizes by ASIZE to get a size index. The package divides blocks into two kinds, small blocks and big blocks, with separate data structures being used for small blocks and big blocks. The configurable parameter, ssz, is used as the boundary. Ssz is in index units, i.e, the largest "small" block size is ASIZE*ssz. The default value for ssz is 8191; the default largest small block size is 65,528. Big block nodes are stored in a BST called big. It isordered by size, and, for blocks of equal size, ordered by address. Since the number of big blocks should be small, I have not bothered with balancing it. There are several data structures for small blocks. For each small size there is a circular doubly linked list of nodes; insertions and deletions are simple O(1) operations. Each list has a list header; which node is chosen as list header doesn't matter - the code merely ensures that the header is in the list. The al and ar fields are used for the list links. The list headers are contained in an array called small; if a list is empty the corresponding array entry is nil (0). There is a problem with these scheme; how does getspace find the best fit index if there is no entry requested size index? A separate data structure, sdata, is used to efficiently search the array for the smallest entry in the array with content that is larger than the requested index. The algorithm used is arcane and I shan't explain it here. The upshot is that a search for a best fit is O(log G) where G is the average gap size between entries with content. The function place_in_free is used to insert blocks in the free blocks data structures. When freespace receives a block address for freeing the normal sequence of events is: * The node corresponding to the address is found and the node is removed from its data structure. * If the storage right neighbour is free the right neighbour is removed from its data structure and it is merged with the newly freed block. * Ditto the left storage neighbour. * The possibly larger freed block is entered in the free blocks data structures. When getspace receives a request the normal sequence of events is: * The input size is rounded up to a request size and its size index is determined. * Function find_available tries to find a best fit node. * If none is found stg_new_slab is called to get a new slab; the slab becomes the "best fit" block. * If the found block is larger than the requested size stg_split_block cuts the block into two pieces; the surplus is entered into the free blocks data structures. * The possibly trimmed found block is made available to the user with the place_in_use routine. PROGRAMMING NOTES FOR THE GETSPACE STORAGE ALLOCATOR Note on allocation strategy: The allocator distinguishes between big blocks (>2048 bytes) and small; blocks. Blocks are allocated in units of 8 bytes. A list is maintained for sizes up MXSZ units (2048 bytes) which heads a linked list of all free blocks of that size. A subsidiary list is used when the regular list of 'small' blocks is sparse. The package gets space from malloc in 131K blocks for allocatable space; node space is also taken from malloc. The package gets storage from the system via malloc in the form of large blocks (slabs). It then breaks them into blocks as needed. Usage of the avail links, al and ar: The two links, al and ar, have four distinct usages. These are: (a) The node free list. This is a singly linked list of nodes that are not in use. Nodes are returned to the free list when blocks are merged or when a slab is freed. Nodes are fetched from the list when a new slab is added or when a block is split. Only the ar link is used. (b) Address map trees. This tree is used for blocks in use. It is a binary tree with the address as the key. (See note 1.) The al and ar links are the left and right children links. There are NH address map trees; there is a simple hash function that is used to choose to which tree an address belongs. (c) The small free blocks table. This is a table of doubly linked lists indexed linearly on size for small sizes. (d) The large free blocks table. This is a table of binary trees indexed on powers of two. Note 1: Address comparisons are not done directly. The addresses are casted to size_t for comparison.