CCITT Fax Filter

From GNUpdf

PDF Reference 3.3.5 CCITTFaxDecode Filter:

The CCITTFaxDecode filter decodes image data that has been encoded using either Group 3 or Group 4 CCITT facsimile (fax) encoding. 
CCITT encoding is designed to achieve efficient compression of monochrome (1 bit per pixel) image data at relatively low resolutions, 
and so is useful only for bitmap image data, not for color images, grayscale images, or general data.

The CCITT encoding standard is defined by the International Telecommunications Union (ITU), formerly known as the Comité Consultatif 
International Téléphonique et Télégraphique (International Coordinating Committee for Telephony and Telegraphy). The encoding algorithm 
is not described in detail in this book but can be found in ITU Recommendations T.4 and T.6. For historical reasons, we refer to these 
documents as the CCITT standard.

CCITT encoding is bit-oriented, not byte-oriented. Therefore, in principle, encoded or decoded data might not end at a byte boundary. 
This problem is dealt with in the following ways:

   * Unencoded data is treated as complete scan lines, with unused bits inserted at the end of each scan line to fill out the last byte. 
     This approach is compatible with the PDF convention for sampled image data.
   * Encoded data is ordinarily treated as a continuous, unbroken bit stream. The EncodedByteAlign parameter can be used to cause each 
     encoded scan line to be filled to a byte boundary. Although this is not prescribed by the CCITT standard and fax machines never do 
     this, some software packages find it convenient to encode data this way.
   * When a filter reaches EOD, it always skips to the next byte boundary following the encoded data.
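
The padding of unencoded scan lines described above can be illustrated with a small sketch. The function names are hypothetical, and two details are assumptions of this sketch rather than statements from the text: pixels are packed high-order bit first, and the unused bits at the end of a line are set to 0.

```python
def row_bytes(columns: int) -> int:
    """Bytes needed for one unencoded scan line of `columns` 1-bit pixels."""
    return (columns + 7) // 8


def pack_scan_line(pixels: list[int]) -> bytes:
    """Pack 1-bit pixels into bytes, high-order bit first (assumed order).

    The unused bits of the last byte are left as 0 (assumed fill value).
    """
    out = bytearray(row_bytes(len(pixels)))
    for i, bit in enumerate(pixels):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)
```

For example, a 1728-pixel line occupies exactly 216 bytes, while a 10-pixel line occupies 2 bytes with 6 unused bits at the end.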

If the CCITTFaxDecode filter encounters improperly encoded source data, an error occurs. The filter does not perform any error correction 
or resynchronization, except as noted for the DamagedRowsBeforeError parameter.

The compression achieved using CCITT encoding depends on the data, as well as on the value of various optional parameters. For Group 3 
one-dimensional encoding, in the best case (all zeros), each scan line compresses to 4 bytes, and the compression factor depends on the 
length of a scan line. If the scan line is 300 bytes long, a compression ratio of approximately 75:1 is achieved. The worst case, an image 
of alternating ones and zeros, produces an expansion of 2:9.
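
The two figures quoted above can be checked with simple arithmetic. This is only a sketch: the helper name is hypothetical, and the worst-case calculation relies on the T.4 terminating code words for a run of one pel (000111 for white, 010 for black), which are not reproduced in this page.

```python
from fractions import Fraction


def compression_ratio(unencoded_bytes: int, encoded_bytes: int) -> Fraction:
    """Ratio of unencoded size to encoded size for one scan line."""
    return Fraction(unencoded_bytes, encoded_bytes)


# Best case: a 300-byte scan line that compresses to 4 bytes.
best = compression_ratio(300, 4)

# Worst case: alternating pixels, so every run has length 1.
# Code words for a run of one pel, taken from the T.4 tables.
WHITE_RUN_1 = "000111"  # 6 bits
BLACK_RUN_1 = "010"     # 3 bits

# Two pixels (one white, one black) cost 6 + 3 = 9 encoded bits.
worst = Fraction(2, len(WHITE_RUN_1) + len(BLACK_RUN_1))
```

Here `best` evaluates to 75 (the 75:1 ratio) and `worst` to 2/9 (the 2:9 expansion).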

Getting the standards:

Parameters used for the CCITT decompression filter

Each entry gives the parameter name, its type and its default value, followed by its meaning.

K (integer, default value: 0)

A code identifying the encoding scheme used:

  • < 0 pure two-dimensional encoding (Group 4)
  • 0 pure one-dimensional encoding (Group 3, 1-D)
  • > 0 mixed one- and two-dimensional encoding (Group 3, 2-D), in which a line encoded one-dimensionally can be followed by at most K − 1 lines encoded two-dimensionally.

The filter distinguishes among negative, zero, and positive values of K to determine how to interpret the encoded data; however, it does not distinguish between different positive K values.

EndOfLine (boolean, default value: false)

A flag indicating whether end-of-line bit patterns are required to be present in the encoding. The CCITTFaxDecode filter always accepts end-of-line bit patterns, but requires them only if EndOfLine is true.

EncodedByteAlign (boolean, default value: false)

A flag indicating whether the filter expects extra 0 bits before each encoded line so that the line begins on a byte boundary. If true, the filter skips over encoded bits to begin decoding each line at a byte boundary. If false, the filter does not expect extra bits in the encoded representation.

Columns (integer, default value: 1728)

The width of the image in pixels. If the value is not a multiple of 8, the filter adjusts the width of the unencoded image to the next multiple of 8 so that each line starts on a byte boundary.

Rows (integer, default value: 0)

The height of the image in scan lines. If the value is 0 or absent, the image's height is not predetermined, and the encoded data must be terminated by an end-of-block bit pattern or by the end of the filter's data.

EndOfBlock (boolean, default value: true)

A flag indicating whether the filter expects the encoded data to be terminated by an end-of-block pattern, overriding the Rows parameter. If false, the filter stops when it has decoded the number of lines indicated by Rows or when its data has been exhausted, whichever occurs first. The end-of-block pattern is the CCITT end-of-facsimile-block (EOFB) or return-to-control (RTC) appropriate for the K parameter.

BlackIs1 (boolean, default value: false)

A flag indicating whether 1 bits are to be interpreted as black pixels and 0 bits as white pixels, the reverse of the normal PDF convention for image data.

DamagedRowsBeforeError (integer, default value: 0)

The number of damaged rows of data to be tolerated before an error occurs. This entry applies only if EndOfLine is true and K is non-negative. Tolerating a damaged row means locating its end in the encoded data by searching for an EndOfLine pattern and then substituting decoded data from the previous row if the previous row was not damaged, or a white scan line if the previous row was also damaged.
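
As an illustration, the parameters and their default values could be gathered in a single structure. This is a minimal sketch, not the GNUpdf implementation: the class name `CCITTFaxParams` and its two helper methods are hypothetical, and the field names follow the PDF parameter names (with the flag for black pixels spelled BlackIs1, as in the PDF Reference).

```python
from dataclasses import dataclass


@dataclass
class CCITTFaxParams:
    """CCITTFaxDecode parameters with their documented default values."""
    K: int = 0
    EndOfLine: bool = False
    EncodedByteAlign: bool = False
    Columns: int = 1728
    Rows: int = 0
    EndOfBlock: bool = True
    BlackIs1: bool = False
    DamagedRowsBeforeError: int = 0

    def scheme(self) -> str:
        """Encoding scheme selected by the sign of K (the magnitude is ignored)."""
        if self.K < 0:
            return "Group 4 (pure 2-D)"
        if self.K == 0:
            return "Group 3, 1-D"
        return "Group 3, 2-D (mixed)"

    def unencoded_row_bytes(self) -> int:
        """Bytes per unencoded line, with Columns rounded up to a multiple of 8."""
        return (self.Columns + 7) // 8
```

With no arguments, `CCITTFaxParams()` describes a Group 3 one-dimensional stream of 1728-pixel lines that is expected to end with an end-of-block pattern.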

Group 3 one-dimensional coding scheme

The one-dimensional run length coding scheme recommended for Group 3 terminals is as follows.

Data

A line of data is composed of a series of variable length code words. Each code word represents a run length of either all white or all black. White runs and black runs alternate. A total of 1728 picture elements represent one horizontal scan line of 215 mm length.

In order to ensure that the receiver maintains colour synchronization, all data lines will begin with a white run length code word. If the actual scan line begins with a black run, a white run length of zero will be sent. Black or white run lengths, up to a maximum length of one scan line (1728 picture elements or pels), are defined by a list of code words. The code words are of two types:

  • Terminating code words
  • Make-up code words

Each run length is represented by either one terminating code word or one make-up code word followed by a terminating code word.
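
Before any code word is chosen, a scan line has to be turned into alternating run lengths that start with white. The sketch below shows one way to do this, including the zero-length white run sent when a line begins with black. The function name is hypothetical, and the pixel convention (0 = white, 1 = black) is an assumption of the sketch.

```python
def scan_line_to_runs(pixels: list[int]) -> list[int]:
    """Return alternating run lengths for one scan line, white run first.

    Assumes pixels contain only 0 (white) and 1 (black).
    """
    runs = []
    current = 0  # colour being counted; every line starts with a white run
    length = 0
    for p in pixels:
        if p == current:
            length += 1
        else:
            runs.append(length)  # close the current run (may be a zero-length white run)
            current = p
            length = 1
    runs.append(length)
    return runs
```

For instance, the line `[1, 1, 0, 0, 0, 1]` yields the runs `[0, 2, 3, 1]`: a white run of zero, then black 2, white 3, black 1.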

Run lengths in the range of 0 to 63 pels are encoded with their appropriate terminating code word. Note that there is a different list of code words for black and white run lengths.
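
Choosing between a lone terminating code word and a make-up plus terminating pair comes down to an integer division, because the make-up code words cover multiples of 64 pels. The following sketch uses a hypothetical function name and only computes the two run lengths to look up; the code word tables themselves are in T.4 and are not reproduced here.

```python
def split_run(run_length: int) -> tuple[int, int]:
    """Split a run into (make_up, terminating) run lengths.

    make_up is 0 when the run is short enough (0 to 63 pels) to need
    only a terminating code word.
    """
    if not 0 <= run_length <= 1728:
        raise ValueError("run length outside the 0..1728 range covered here")
    make_up = (run_length // 64) * 64  # largest multiple of 64 not exceeding the run
    terminating = run_length % 64      # remainder, always in 0..63
    return make_up, terminating
```

A run of 100 pels splits into a make-up length of 64 and a terminating length of 36; a run of exactly 64 pels still needs a terminating code word, for a length of 0.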

Run lengths in the range of 64 to 1728 pels are encoded first by the make-up code word representing the run length which is equal to or shorter than that required. This is then followed by the terminating code word representing the difference between the required run length and the run length represented by the make-up code.

End-of-line (EOL)

This code word follows each line of data. It is a unique code word that can never be found within a valid line of data; therefore, resynchronization after an error burst is possible.

In addition, this signal will occur prior to the first data line of a page.

  • Format: 000000000001

Fill

A pause may be placed in the message flow by transmitting Fill. Fill may be inserted between a line of data and an EOL, but never within a line of data. Fill must be added to ensure that the transmission time of data, Fill and EOL is not less than the minimum transmission time of the total coded scan line established in the pre-message control procedure. The maximum transmission time of Fill bits shall be less than 5 seconds.

  • Format: variable length string of 0s.
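
The amount of Fill needed for one line follows from the minimum transmission time and the data signalling rate. The sketch below is illustrative only: the function name and parameter names are hypothetical, the values used in the example (4800 bit/s, 20 ms) are sample figures rather than values taken from this page, and the 5-second limit on Fill is not checked.

```python
import math


def fill_bits_needed(coded_bits: int, bit_rate: int,
                     min_line_time_ms: float, eol_bits: int = 12) -> int:
    """Number of 0 bits to insert between the coded data and the EOL.

    coded_bits:       length of the coded line data in bits
    bit_rate:         data signalling rate in bit/s
    min_line_time_ms: minimum transmission time of a coded scan line
    eol_bits:         length of the EOL code word (12 bits)
    """
    min_bits = math.ceil(bit_rate * min_line_time_ms / 1000)
    return max(0, min_bits - (coded_bits + eol_bits))
```

At 4800 bit/s with a 20 ms minimum, a line must occupy at least 96 bits, so a line with 20 bits of coded data and a 12-bit EOL needs 64 Fill bits, while a line with 200 bits of coded data needs none.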

Return to control (RTC)

The end of a document transmission is indicated by sending six consecutive EOLs. Following the RTC signal, the transmitter will send the post message commands in the framed format and the data signalling rate of the control signals defined in ITU-T Rec. T.30.

  • Format: 000000000001...(6 times)...000000000001
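
The EOL and RTC formats above can be written down directly as bit patterns. This sketch represents bit streams as strings of "0" and "1" for readability, which is an assumption made for illustration (a real decoder would work on packed bytes), and the function name is hypothetical. It covers only the one-dimensional format shown here, with no Fill between the six EOLs.

```python
EOL = "000000000001"  # eleven 0 bits followed by a single 1 bit
RTC = EOL * 6         # six consecutive EOLs


def find_rtc(bits: str) -> int:
    """Return the bit offset of the first RTC in `bits`, or -1 if absent."""
    return bits.find(RTC)
```

The RTC pattern is 72 bits long, so in a byte-oriented stream it spans at least 9 bytes.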

Group 3 two-dimensional coding scheme

The two-dimensional coding scheme is an optional extension of the one-dimensional coding scheme specified in 4.1 and is as follows.

Parameter K

In order to limit the disturbed area in the event of transmission errors, after each line coded one-dimensionally, at most K-1 successive lines shall be coded two-dimensionally. A one-dimensionally coded line may be transmitted more frequently than every K lines. After a one-dimensional line is transmitted, the next series of K-1 two-dimensional lines is initiated. The maximum value of K shall be set as follows:

  • Standard vertical resolution: K = 2
  • Optional higher vertical resolution:
    • 200 Lines/25.4 mm, K = 4
    • 300 Lines/25.4 mm, K = 6
    • 400 Lines/25.4 mm, K = 8
    • 600 Lines/25.4 mm, K = 12
    • 800 Lines/25.4 mm, K = 16
    • 1200 Lines/25.4 mm, K = 24
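
The limits above and the resulting line pattern can be sketched as follows. The names are hypothetical, and the schedule shown is only the sparsest one permitted (a one-dimensional line every K lines); an encoder may send one-dimensional lines more often.

```python
# Maximum K for the optional higher vertical resolutions, keyed by lines per 25.4 mm.
MAX_K_BY_LINES_PER_25_4_MM = {200: 4, 300: 6, 400: 8, 600: 12, 800: 16, 1200: 24}
MAX_K_STANDARD = 2  # standard vertical resolution


def line_modes(num_lines: int, k: int) -> list[str]:
    """Coding mode of each line when every K-th line is coded one-dimensionally."""
    if k < 1:
        raise ValueError("K must be at least 1 for mixed Group 3 coding")
    return ["1-D" if i % k == 0 else "2-D" for i in range(num_lines)]
```

With K = 4, the first four lines are coded as 1-D, 2-D, 2-D, 2-D, so a transmission error can disturb at most the lines up to the next one-dimensional line.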