Richard Harter's World
Site map
November 2007
Mathcomp
Getfline design doc
email

Getfline specification

Note: This is the original published specification for getfline. It has been replaced by gflspec.html.

Synopsis:

#include "getfline.h"

int getfline(FILE   * fptr,
             char  ** line_ptr,
             size_t * length_ptr,
             size_t * bufsize_ptr,
             size_t   maxlen,
             long   * flags);

Note: getfline.h is guaranteed to include stdio.h.

Description

Function getfline reads lines from a stream. Each time it is called it fetches the next line until there are none left. Getfline has fairly elaborate error detection and reporting facilities. It also permits minimizing storage allocation and deallocation.

Calling sequence arguments

Argument fptr is the stream from which to read characters. It must be an open input stream, for example one previously opened with fopen. fptr==NULL is an argument error; other invalid values for fptr produce an I/O error.

Argument line_ptr is the address of a pointer to the buffer for reading the line. In "clean copy" mode the previous value of line_ptr is overwritten with a pointer to a newly allocated buffer; the user is responsible for freeing all buffers. In "transient mode" the value of line_ptr is controlled by getfline; it should initially be NULL.

Argument length_ptr is a pointer to a size_t variable that holds the length of the line when there is a successful read.

Argument bufsize_ptr is a pointer to a size_t variable that holds the allocated size of the buffer holding the line.

Argument maxlen is the maximum line length that getfline will return.

Finally, argument flags is a pointer to a status word. The status word is divided into two sections, input flags and output flags. When getfline is called it uses the input flags to determine the course of action. When it returns, the status word will hold output flags that indicate what happened during the read. The status word should be initialized by ORing together the desired input flags.
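As a rough illustration of the status word, the sketch below builds an input word by ORing flags together and then tests a bit in it. The numeric bit values are placeholders invented for this example (the real values come from getfline.h), and the helper names make_status and status_has are hypothetical.

```c
/* Hypothetical bit values, for illustration only.
   The real definitions come from getfline.h. */
#define GFL_FLAG_TRANS 0x0001L
#define GFL_FLAG_LOG   0x0040L
#define GFL_ERR        0x0100L

/* Build the initial status word from the desired input flags. */
long make_status(int transient, int log_errors)
{
    long status = 0;
    if (transient)  status |= GFL_FLAG_TRANS;
    if (log_errors) status |= GFL_FLAG_LOG;
    return status;
}

/* Test whether a given flag bit is set in the status word. */
int status_has(long status, long flag)
{
    return (status & flag) != 0;
}
```

The same status_has test works for input flags before the call and for output flags such as GFL_ERR after it.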

Function return

Getfline returns 1 if a line was returned and 0 if no line was returned. A return of 0 can indicate either that EOF was reached or that the read was terminated by an error.
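Since a return of 0 is ambiguous, the caller has to look at the status word to tell the two cases apart. A minimal sketch follows; the value of GFL_ERR is a placeholder and the names read_end and classify_end are hypothetical.

```c
/* Hypothetical bit value; the real one comes from getfline.h. */
#define GFL_ERR 0x0100L

enum read_end { END_OF_FILE, END_ON_ERROR };

/* Call after getfline has returned 0: the general error flag
   separates an error from an ordinary end of file. */
enum read_end classify_end(long status)
{
    return (status & GFL_ERR) ? END_ON_ERROR : END_OF_FILE;
}
```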

Usage modes

Getfline can produce either "clean copies" or "transient copies". A clean copy is one that has no encumbrances - it lasts until it is specifically deleted. A transient copy is one that lasts until the next call to getfline to get another line from the current file. Transient copy mode will be used if the GFL_FLAG_TRANS flag is set; clean copy mode is the default.

In clean copy mode getfline allocates a new buffer for each line that is read. The user is responsible for freeing the storage for the lines. Clean copy mode is appropriate when we want to keep part or all of the file in memory. In clean copy mode the values of items pointed to by line_ptr, length_ptr, and bufsize_ptr are overwritten; the previous values are ignored.

In transient copy mode getfline reuses the line buffer; the previous contents, if any, will be overwritten. Getfline handles buffer storage management; the user does not have to allocate or free space for lines. Transient copy mode is appropriate when the contents of a line are immediately processed. Transient copy mode may be more efficient than clean copy mode because it minimizes the number of calls to malloc and free.

In transient copy mode the line buffer pointer must initially either be NULL or point to storage allocated by malloc, and the variable pointed to by bufsize_ptr must contain the allocated size of the line buffer. The user should not alter these variables after initializing them. When a read fails the line buffer storage is freed by getfline and the line pointer is cleared.
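The buffer discipline described above can be sketched with a small stand-in reader. This is not getfline and not its implementation: read_transient is a hypothetical function that only shows one buffer being reused and grown across calls, then freed and cleared when a read fails. It assumes the line is returned without its EOL marker and has no maxlen handling or flags; resetting the buffer size to 0 on failure is also an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

/* NOT getfline: a simplified stand-in that reuses one growable buffer.
   Reads one line (dropping the '\n') into *line and stores its length
   in *len. Returns 1 if a line was read, 0 at end of input or if the
   buffer cannot grow. On 0 the buffer is freed and the pointer cleared. */
int read_transient(FILE *fp, char **line, size_t *len, size_t *bufsize)
{
    size_t n = 0;

    for (;;) {
        int c = fgetc(fp);
        if (c == EOF && n == 0) goto fail;      /* nothing left to read */
        if (n + 1 > *bufsize) {                 /* index n must be writable */
            size_t newsize = *bufsize ? *bufsize * 2 : 16;
            char *tmp = realloc(*line, newsize);
            if (tmp == NULL) goto fail;
            *line = tmp;
            *bufsize = newsize;
        }
        if (c == EOF || c == '\n') break;       /* end of this line */
        (*line)[n++] = (char)c;
    }
    (*line)[n] = '\0';
    *len = n;
    return 1;

fail:
    free(*line);
    *line = NULL;
    *bufsize = 0;
    return 0;
}
```

The caller starts with a NULL pointer and a size of 0 and never touches either variable between calls, which mirrors the initialization rule stated above.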

Handling anomalous lines

There are two kinds of anomalous lines that can occur: a prematurely terminated last line (i.e., one that lacks an EOL (\n) marker) and lines that are longer than maxlen. A prematurely terminated last line will be treated as an error unless the GFL_FLAG_EOFOK flag is set. Whether or not it is treated as an error, the GFL_RD_BADEOF flag will be set.

Getfline offers four different ways to handle "long" lines; they can be treated as errors, they can be omitted, they can be truncated, or they can be cut into pieces that are maxlen bytes long. Regardless of the method chosen, if the line is long the GFL_RD_LONG flag will be set in the status word.

The default is to treat "long" lines as errors. One of the other three choices can be selected by setting one of the following three flags: GFL_FLAG_OMIT, GFL_FLAG_TRUNC, or GFL_FLAG_CUT. Only one of these flags can be set; setting more than one is an argument error.
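The "only one of these flags" rule is easy to check before a call. The sketch below is one way a caller might do it; the bit values are placeholders invented for this example, and long_line_flags_ok is a hypothetical helper, not part of getfline.

```c
/* Hypothetical bit values; the real ones come from getfline.h. */
#define GFL_FLAG_CUT   0x0004L
#define GFL_FLAG_TRUNC 0x0008L
#define GFL_FLAG_OMIT  0x0010L

/* Returns 1 if at most one long-line action flag is set, else 0. */
int long_line_flags_ok(long flags)
{
    long actions = flags & (GFL_FLAG_CUT | GFL_FLAG_TRUNC | GFL_FLAG_OMIT);

    /* x & (x - 1) clears the lowest set bit, so the result is zero
       exactly when zero or one of the action bits is set. */
    return (actions & (actions - 1)) == 0;
}
```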

There are two flags, GFL_RD_LONG and GFL_RD_BADEOF, that are set if the current line being read is anomalous. These flags are set even when the input flags make the line acceptable. The GFL_RD_BADEOF flag is set if the last line was prematurely terminated. The GFL_RD_LONG flag is set if the current line is long. If the GFL_FLAG_CUT flag is set, the GFL_RD_BADEOF flag will not be set when the final piece of a long line is read.

Responses to errors by getfline

If there is an error, getfline will set three flag bits in the status word. One is the general error flag, GFL_ERR; the second is an error category flag; and the third is the specific error. There are two error categories, argument errors and execution errors; the corresponding flags are GFL_ARG and GFL_EX. There is a separate flag for an error (usually the argument being a NULL pointer) in each of the arguments.
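The three-level scheme (general flag, category flag, specific flag) lends itself to a top-down test. The following sketch decodes the first two levels only; the bit values are placeholders for this example and error_category is a hypothetical helper.

```c
/* Hypothetical bit values; the real ones come from getfline.h. */
#define GFL_ERR 0x0100L
#define GFL_ARG 0x0200L
#define GFL_EX  0x0400L

/* Map a status word to a short description of its error category. */
const char *error_category(long status)
{
    if (!(status & GFL_ERR)) return "no error";
    if (status & GFL_ARG)    return "argument error";
    if (status & GFL_EX)     return "execution error";
    return "unknown error";   /* GFL_ERR set without a category flag */
}
```

A caller that needs the specific cause would go one level further and test the individual GFL_ARG_* or GFL_EX_* flags listed in the table of flags.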

There are three different execution errors: I/O errors, storage allocation errors, and unacceptable line errors. An I/O error is reported if there is a read error. A storage allocation error is reported if malloc or realloc fails. An unacceptable line is either a long line or a prematurely terminated line for which no handling input flag has been set.

There are two input flags, GFL_FLAG_LOG and GFL_FLAG_EXIT, that specify actions that getfline should take if there is an error. If the GFL_FLAG_LOG flag is set, getfline will write an error message to stderr when there is an error. If the GFL_FLAG_EXIT flag is set, getfline will call exit when there is an error.

Sample usage

Here are a couple of usage examples. For each we assume the following includes and declarations:

#include "getfline.h"
...
    FILE *fptr;
    char *line = 0;
    size_t len, bufsize = 0;
    long status;

Example one illustrates clean copy mode. In example one we are reading a file called somefile.txt. Each line is passed to a processing function that takes ownership of the line and the responsibility for freeing its storage.

    status = 0;
    fptr = fopen("somefile.txt","r");
    while(getfline(fptr,&line,&len,&bufsize,1024,&status)) {
        process_line(line);
    }
    if (status & GFL_ERR) fprintf(stderr,"Oops\n");
    if (fptr) fclose(fptr);
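
Example one leaves process_line unspecified. One possible shape for a function that takes ownership of clean copies is sketched below; struct line_store, store_line, and free_store are hypothetical names introduced for this illustration, and a process_line built this way would simply hand each line to store_line.

```c
#include <stdlib.h>

/* Hypothetical store that owns every line handed to it. */
struct line_store {
    char  **lines;
    size_t  count;
    size_t  capacity;
};

/* Take ownership of line. Returns 1 on success, 0 if the store cannot
   grow, in which case the line is freed so that it is not leaked. */
int store_line(struct line_store *s, char *line)
{
    if (s->count == s->capacity) {
        size_t newcap = s->capacity ? s->capacity * 2 : 8;
        char **tmp = realloc(s->lines, newcap * sizeof *tmp);
        if (tmp == NULL) {
            free(line);
            return 0;
        }
        s->lines = tmp;
        s->capacity = newcap;
    }
    s->lines[s->count++] = line;
    return 1;
}

/* Free every stored line and then the store's own array. */
void free_store(struct line_store *s)
{
    for (size_t i = 0; i < s->count; i++)
        free(s->lines[i]);
    free(s->lines);
    s->lines = NULL;
    s->count = 0;
    s->capacity = 0;
}
```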

Example two illustrates transient copy mode. In example two we are reading input from stdin and writing it to stdout. However, lines longer than 80 characters are cut into pieces so that a newline is written after every 80 characters.

    status = GFL_FLAG_TRANS | GFL_FLAG_CUT | GFL_FLAG_LOG;
    while(getfline(stdin,&line,&len,&bufsize,80,&status)) {
        fprintf(stdout,"%s\n",line);
    }

Table of flags

Input flags:
    GFL_FLAG_TRANS      Provide transient copies.  The default is
                        clean copies.
    GFL_FLAG_EOFOK      Says it's okay if the last line is
                        terminated by a premature EOF.  The 
                        default is to treat a premature EOF
                        as an error.
    GFL_FLAG_CUT        Says to use all of a long line but
                        break into pieces of length maxlen.
    GFL_FLAG_TRUNC      Says to use the first maxlen bytes
                        of a long line and discard the rest.
    GFL_FLAG_OMIT       Ignore long lines. 
    GFL_FLAG_EXIT       Exit if there is an execution error.
    GFL_FLAG_LOG        Write an error message to stderr if
                        there is an execution error.
                        
General error flags:
    GFL_ERR             This is set if there was any error.
    GFL_ARG             This is set if there are any
                        argument errors.
    GFL_EX              This is set if there are any
                        execution errors.
                        
Input argument error flags:
    GFL_ARG_FILE        An invalid file pointer (presumably
                        a null pointer) was passed.
    GFL_ARG_LINE        The line_ptr argument was a null
                        pointer.
    GFL_ARG_LENGTH      The length_ptr argument was a null
                        pointer.
    GFL_ARG_BUFSIZE     The bufsize_ptr argument was a null
                        pointer.
    GFL_ARG_STATUS      The flags (status word) argument
                        was a null pointer.
    GFL_ARG_FLAGS       Multiple long line action flags are set.
    
Execution error flags:
    GFL_EX_IO           There was an I/O error in trying to
                        read the file.
    GFL_EX_STORAGE      There was an error in the storage
                        allocation routines.
    GFL_EX_READ         The file had a format error (long
                        line or premature EOF) and no flag
                        was set for handling the error.
                        
Anomalous line flags:
    GFL_RD_LONG         This is set if the line is long.
    GFL_RD_BADEOF       This is set if there was a premature
                        end of file.


This page was last updated November 9, 2007.
