PDF Guidance

General information on PDFs
PDF Usage
Planning For Accessibility
Testing
How-To
Converting Documents
Microsoft Office
Reverse conversion
Tagging Documents
Adobe Acrobat

General information on PDFs

Portable Document Format (PDF) files are among the most common means of publishing documents electronically.  The five biggest advantages of using PDF documents are that:
    • The software to read them is free and readily available.
    • Layout, color, fonts and design are locked into place.
    • PDFs are easy to create.
    • The integrity of the document is maintained.
    • Reviewing documents and inserting comments is easy.

The disadvantages of using PDF documents are:

    • They are slow to download.
    • They are hard to navigate.
    • They are inaccessible to those with mobility impairments.
    • They are inaccessible to those with visual impairments.

The latter two raise issues with human rights, particularly in relation to public documents.

When people talk about the accessibility of Adobe Acrobat or PDF files, they are usually referring to the accessibility of Acrobat to screen readers, but screen reader users are not the only people who should be considered when creating accessible PDF files.

A PDF is a database of different data types.  The mere presence of tags does not guarantee accessibility, because the tags may be used incorrectly, but the absence of tags guarantees that the PDF itself is not accessible.

    • Acrobat versions since 4.05 have been at least adequately competent, some of the time, at making “inaccessible” PDFs functionally accessible.
    • Acrobat 5 and later can infer a reading order and reflow text.
    • Acrobat 6 and later can read text out loud, functioning as a de facto screen reader.

PDF Usage

Unfortunately, the portable document format is grossly overused.  In general, only certain kinds of documents should be published as PDFs.  Any simple text-and-graphic document that is typeset in a single column should be provided as an ordinary HTML/CSS web page.

The only exceptions are the following:

    • Footnoted, endnoted, or sidenoted documents, since there is no way to mark up any of those structures in HTML.
    • An interactive form, since PDF interactivity can do more than HTML can.  (Use with caution and only if HTML really cannot do what you want.)
    • A multimedia presentation, since later versions of PDF can truly embed multimedia rather than simply refer to or call multimedia, as HTML does.  PDF multimedia can include captions and/or audio descriptions.
    • Combined accessible and inaccessible versions.  A typical case is a scan of a historical document that also includes live text.
    • Documents custom-crafted solely for printing.
    • Documents designed for annotation and round-trip travel.  If you’re posting something to elicit comments, which are then sent back to you, PDF has useful structures that HTML doesn’t.
    • A type specimen, which is all but impossible to create in HTML, unless the specimen involved is a “typeface” like Arial.
    • A sample of a format that cannot be rendered in a browser (e.g. Illustrator or Photoshop documents) or can only be rendered unsatisfactorily (CAD drawings where GIF and JPEG don’t have enough resolution).  This case also includes PDF files meant as samples of PDF files.
    • A record of a document’s state at a specific moment.  In this context, PDF is useful as a preservation format even for HTML web pages.
    • Documents that contain mathematical notation.
    • Documents with a legally restricted format, like U.S. tax forms.
    • Multi-columnar documents, particularly if figures and illustrations are included, since multicolumn web layouts are unreliable as a method of reproducing print layouts.  (Your multicolumn document should be HTML if it is presented that way merely to save paper and it can work as a single column.  It can be difficult to distinguish that case from a document that is structurally multi-columnar.)

Providing as many documents as possible in HTML not only improves compliance, it also improves search engine results.  Search engines will not index ill-structured (non-compliant) PDF documents that they cannot read properly.  For more information, please see the section “Testing”.

Planning For Accessibility

Use the latest version of publishing tools that support accessibility features.  The latest versions of Word, InDesign, PageMaker, and Quark create better-tagged Adobe PDF files, which offer more accessibility functionality than the structured Adobe PDF files created by older versions of the software.

Define a logical reading order for your document.  Logical reading order, or logical structure, refers to the organization of a document, such as the title page, chapters, sections, and subsections.  This logical structure provides a mechanism to indicate the precise reading order and improve navigation, particularly for longer, more complex documents.  In addition, when viewing a tagged Adobe PDF file in which the logical reading order has been clearly defined, a user can use Acrobat’s Reflow feature to zoom in to any portion of the document, and the text will automatically reflow to fit the available screen space.

Use application-based styles to format text and define and create document structure such as titles, chapters, headings, and paragraphs.  Styles provide structure information when you create a tagged Adobe PDF file.  For example, do not use the Enter key to add space between paragraphs.  Instead, use the "Spacing Before" and "Spacing After" paragraph properties to achieve this effect.

Create column layouts using your application’s column layout feature.  Don’t use tabs to simulate double-column text.  For example, if a document has been correctly authored using the application’s two-column layout, the screen reader knows it should read all the way down the first column and then proceed to the second column.  On the other hand, if the writer used tabs to imitate the look of two-column text, the screen reader would simply read horizontally, going from the first line in the first column and then tabbing over to the first line in the second column.
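
The difference in reading order can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the column contents and both function names are hypothetical, and a real screen reader works from the document’s tags rather than from lists of strings.

```python
# Hypothetical two-column page: each list holds the lines of one column.
left_column = ["Tagged PDF files", "carry structure."]
right_column = ["Screen readers", "rely on that."]


def read_real_columns(left, right):
    """Real columns: read all of column one, then all of column two."""
    return " ".join(left + right)


def read_tab_simulated(left, right):
    """Tab-simulated columns: each visual row is a single line of text,
    so the two columns are interleaved row by row."""
    rows = [f"{left_line}\t{right_line}" for left_line, right_line in zip(left, right)]
    return " ".join(row.replace("\t", " ") for row in rows)


print(read_real_columns(left_column, right_column))
print(read_tab_simulated(left_column, right_column))
```

The first line printed keeps each sentence intact; the second mixes the two columns together, which is what a listener would hear.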

Create tables using your application’s table creation feature.  Don’t use tabs or graphics to create a table.  It is also helpful to use table formats in the authoring application, such as table column heading, row heading, table cell data, etc.

Avoid complicated table structures using merged and split cells, and nested and combined tables to produce a desired layout.  Avoid using tables for layout purposes.  Complex tables are difficult to impossible to export accessibly, and you will end up spending hours re-tagging your tables in Acrobat.

Use Unicode text, which is a standard for describing text characters.  This ensures that all characters and words are presented to assistive technologies in a clear and understandable manner.  Unicode also differentiates between soft and hard hyphens.  As a result, a hyphenated word that spans two lines, such as "com-puter," can be read as a single word.
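
As a small illustration of the hyphen distinction, the Python sketch below treats the soft hyphen (U+00AD) as an optional line-break point and the ordinary hyphen (U+002D) as part of the word.  The function name is hypothetical, and how any particular assistive technology actually handles these characters is an assumption, not something this guidance specifies.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD: optional break point, not part of the word
HARD_HYPHEN = "-"       # U+002D: a real hyphen that belongs to the word


def text_for_speech(text):
    """Drop soft hyphens so a word broken across two lines is read whole."""
    return text.replace(SOFT_HYPHEN, "")


wrapped = "com" + SOFT_HYPHEN + "puter"   # word hyphenated at a line end
compound = "text" + HARD_HYPHEN + "only"  # word that really contains a hyphen

print(text_for_speech(wrapped))
print(text_for_speech(compound))
```

Text that used a plain hyphen for the line break would leave no way to tell the two cases apart.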

Embed all fonts when creating a PDF file from your publishing application.  This will allow for touch-ups that might become necessary in your final PDF file.  If the font has been subsetted, you will not be able to edit the text from within Acrobat.

Group complex illustrations.  If you created an illustration out of several smaller illustrations, use the Group command to group them into a single illustration.

Add alternate text to images.  Include equivalent text descriptions for graphics, so that a blind person using screen reader software can understand the purpose of the graphics.  Keep in mind that repeating images with long text descriptions will become very tiresome, so label them accordingly.  Some graphics are present to add color and visual appeal to a document.  These document elements, which are referred to as artifacts, do not need alternate text since they are not adding to the message of the document.

Do not rely on color to convey information.  If color is used to convey important information, an alternative indicator must be used, such as an asterisk (*) or other symbol.

Planning for accessibility for people with other types of disabilities should also play a part in your document creation process.

    • Motor Disabilities
      Don’t make hot spots too small.  Of course, the phrase “too small” is relative, and it is true that people can enlarge the document, thus enlarging the hot spots within the document, but use good judgment here.  The smaller the link, the more difficult it will be for someone with limited fine muscle control to click on the link.
    • Hearing Disabilities
      Provide transcripts for multimedia.  If you embed multimedia objects with sound in your PDF documents, you will exclude both the deaf and the deaf-blind if you do not provide a transcript.

      Provide synchronized captions for video.  People who are deaf need this if the video does not make sense when the sound is turned off.
    • Low Vision
      Ensure there is enough contrast in the PDF document.

      Ensure that any information conveyed with color is conveyed equally well when color is not available.  You may want to use a textual clue in addition to the color in order to convey the information.
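
“Enough contrast” can be put into numbers.  The sketch below uses the contrast-ratio formula from WCAG 2.0, which is an assumption brought in for illustration; this guidance itself does not name a formula or a threshold.  WCAG 2.0 level AA asks for at least 4.5:1 for normal-size text.  The function names are hypothetical.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0-255,
    following the WCAG 2.0 definition."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(value) for value in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG 2.0 contrast ratio, from 1:1 (no contrast) up to 21:1."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


black, white, mid_gray = (0, 0, 0), (255, 255, 255), (119, 119, 119)
print(round(contrast_ratio(black, white), 2))
print(contrast_ratio(mid_gray, white) >= 4.5)
```

Black on white gives the maximum ratio of 21:1, while the mid-gray text on white falls just short of 4.5:1.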

Testing

One of the easiest ways to check whether your document has been tagged correctly and is compliant is a simple Google search using the “View as HTML” option.  If Google cannot decipher your document, it is most likely not compliant.  If your document has been tagged properly, the HTML version Google presents will have content that flows in the order of the source document and makes sense.

Another way to test the compliance of your PDF is to open it in Adobe Acrobat (not the browser plug-in), export it as text, and review the reading order.
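
Part of that review can be automated once the text has been exported.  The following Python sketch assumes you already have the exported text as a string and a list of phrases (for example, headings) in the order you expect them; the function name and sample strings are hypothetical.  It does not replace reading the export yourself.

```python
def first_out_of_order(exported_text, expected_phrases):
    """Return the first expected phrase that is missing, or that does not
    appear after the phrase before it; return None when the order matches."""
    position = 0
    for phrase in expected_phrases:
        found = exported_text.find(phrase, position)
        if found == -1:
            return phrase
        position = found + len(phrase)
    return None


expected = ["General information on PDFs", "PDF Usage", "Testing"]

good_export = "General information on PDFs\n...\nPDF Usage\n...\nTesting\n..."
bad_export = "PDF Usage\n...\nGeneral information on PDFs\n...\nTesting\n..."

print(first_out_of_order(good_export, expected))
print(first_out_of_order(bad_export, expected))
```

A result of None means every phrase was found in the expected sequence; any other result names the first phrase where the exported reading order departs from it.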

How-To

Converting Documents

Microsoft Office

When converting from Microsoft Office, only well-marked-up Office documents will produce well-marked-up HTML or PDF files.  Garbage in, garbage out.

Acrobat 7.0 for Windows allows authors to create tagged Adobe PDF files directly from Microsoft Office 2000 and Office XP for Windows.  The tagged file incorporates logical structure and alternative text descriptions for graphics, making it easier for users to navigate a document in the proper reading order and understand the meaning of graphics.

In MS Office products you must use real headings (not just large, bold fonts), bullets, numbered lists and other structural tags in the original Office document.  If you do not, then the correct tags will not be created when the document is converted into HTML or PDF.  For most, this will mean learning how to use the structural elements within Word, as many people don’t pay attention to “style” options in word processors.  Instead, most pay attention to the visual output.  This will need to change in order to make the content accessible and usable to screen readers.
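
One way to see whether a Word document uses real headings is to look at the paragraph styles stored in the file.  The Python sketch below is a rough check under several assumptions: the file is in the .docx format (Office 2007 or later, not the older .doc format), heading styles have IDs beginning with “Heading” (true of English-language Word defaults, but not guaranteed elsewhere), and the function names are hypothetical.  The sample it builds is a stand-in containing only the one part the check reads, not a complete Word file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def paragraph_styles(docx_file):
    """Return (style_id, text) for each paragraph; style_id is None when
    no named style is applied to the paragraph."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    results = []
    for para in root.iter(W + "p"):
        style = para.find(W + "pPr/" + W + "pStyle")
        style_id = style.get(W + "val") if style is not None else None
        text = "".join(node.text or "" for node in para.iter(W + "t"))
        results.append((style_id, text))
    return results


def has_real_headings(docx_file):
    """True when at least one paragraph uses a style ID starting with 'Heading'."""
    return any(style_id is not None and style_id.startswith("Heading")
               for style_id, _ in paragraph_styles(docx_file))


def build_sample():
    """Hypothetical stand-in for a real file: a zip holding only document.xml."""
    xml = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body>"
        '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr><w:r><w:t>PDF Usage</w:t></w:r></w:p>'
        "<w:p><w:r><w:t>Body text.</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


print(paragraph_styles(build_sample()))
print(has_real_headings(build_sample()))
```

A document whose “headings” are only large, bold body text would show no heading style IDs at all, which is exactly the case that produces poor tags on conversion.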

Reverse conversion

When converting a PDF file to HTML, it is always best to go back to the source.  Sometimes the original file used to create the PDF is unavailable.  In that case you can create an HTML file using Acrobat, but the file will probably be more complex and will require more work to make it accessible.

Tagging Documents

Acrobat 7.0 automatically analyzes a document’s logical structure and creates a new version that approximates the original structure and reading order.  In most cases, this file will work better with a screen reader than an untagged file will.  You can also use this tool in conjunction with the Acrobat 7.0 batch processing function to convert volumes of documents efficiently.

Remember to always provide alternative text for any graphics in your source document.

Adobe Acrobat

The basic steps:

    1. Open the PDF.
    2. Check the Description pane of the Document Properties screen (File menu), which will tell you whether the document is tagged.
    3. If it isn’t, dismiss that screen.  Go to the Advanced menu and choose Accessibility -> Add Tags to Document.
    4. Run a full accessibility check from that same menu.
    5. If the checker reports any problems, open the tags palette (View -> Navigation Tabs -> Tags).  Use the disclosure triangles to step through your document’s new tag structure.  You’re better off if you select Highlight Content from the palette’s Options menu, as Acrobat will then draw a hard-to-see border around the object whose tag you select.
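
The check in step 2 can be roughly approximated outside Acrobat.  The Python sketch below is a heuristic only, with hypothetical function names: it searches the raw bytes of a file for the two markers a tagged PDF carries in its document catalog, a structure tree root and a mark-information entry set to true.  Newer PDF files can store the catalog in compressed form, so a negative result is inconclusive, and a positive result says nothing about whether the tags are used correctly.

```python
import re


def looks_tagged(pdf_bytes):
    """Heuristic: True when the raw bytes contain both a structure tree
    root and a '/Marked true' entry.  False does not prove the file is
    untagged, because these entries may sit inside compressed objects."""
    has_structure_tree = b"/StructTreeRoot" in pdf_bytes
    marked_true = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_structure_tree and marked_true


def file_looks_tagged(path):
    """Apply the same heuristic to a file on disk."""
    with open(path, "rb") as handle:
        return looks_tagged(handle.read())


# Hypothetical catalog fragments standing in for real files.
tagged_fragment = b"<< /Type /Catalog /MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>"
untagged_fragment = b"<< /Type /Catalog /Pages 2 0 R >>"

print(looks_tagged(tagged_fragment))
print(looks_tagged(untagged_fragment))
```

Something like this may help when sorting a large folder of files into “probably tagged” and “needs a look in Acrobat”, but the Document Properties screen and the full accessibility check remain the authoritative tests.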

Department of Commerce Web Advisory Council (WAC)
U.S. Department of Commerce

Send questions and comments about this page to WAC@doc.gov
Page last updated October 12, 2010