I’m sure you’ve found yourself in this situation before: you turn to search engines like Google or Bing to research a topic and come across rather obscure-looking titles or unhelpful descriptions embedded in the results pages. They could look something like this:

Search results with errant document metadata.

What is this stuff, and why is it in your results?

The Origins of Document Metadata

What you’re seeing is called “document metadata,” and programs such as Microsoft Office automatically embed it in the files they create. Metadata typically includes the document’s author and title. Unless you or your IT staff have customized your program’s settings, these values are usually generated from your login name and the first line of text in your document.

You probably won’t notice document metadata if you pass files back and forth through e-mail, but it becomes both visible and vital once these files are posted to the web or to internal, web-based collaboration tools within your organization.

Why should you pay attention to metadata?

The Importance of Document Metadata

Correct and valid document metadata can go a long way to:

  • Improve the user’s experience. The online content your organization publishes will be difficult for users to identify if it comes up with the title “Chapter1.doc” in the results. This applies both to public search engines and to your own website’s search engine.
  • Boost your visibility in search engine result pages (SERPs). Adding relevant keywords to the title, author, and description of your documents can help push these items up in SERP rankings.
  • Reduce the chances of releasing internal or sensitive data to the public. The default settings for your documents’ metadata could include non-public, potentially sensitive, or personally identifiable information such as user names, contact information, and server or network names, among other details. Errant metadata could violate legal disclosure agreements or give social engineers valuable clues about the nature of your internal network.

Microsoft has a helpful article that shows you how to find and update (or remove) metadata in your documents. Metadata is also passed along when you create PDFs from your office application, so you should take care of your data at the source.
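If you have many files to check, a short script can do a first pass before you open each one by hand. The following is a minimal sketch, not a finished tool: it assumes the file uses the Office Open XML format (.docx, .xlsx, .pptx), which is a ZIP archive that stores the title and author in docProps/core.xml. The function names read_core_properties and looks_unhelpful are hypothetical, and the "unhelpful title" rule is only an illustrative heuristic.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(docx):
    """Return the title, author, and last editor stored in the file.

    `docx` may be a file path or a binary file-like object.
    Missing fields come back as empty strings.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text(tag):
        node = root.find(tag, NS)
        return (node.text or "").strip() if node is not None else ""

    return {
        "title": text("dc:title"),
        "author": text("dc:creator"),
        "last_modified_by": text("cp:lastModifiedBy"),
    }


def looks_unhelpful(title):
    """Hypothetical heuristic: flag empty titles and titles that are just a file name."""
    pattern = r"\.(docx?|xlsx?|pptx?|pdf)$"
    return not title or re.search(pattern, title, re.IGNORECASE) is not None
```

For example, calling read_core_properties("report.docx") on a file of your own (the file name here is a placeholder) returns a dictionary you can review for login names or leftover titles such as “Chapter1.doc” before the file is published.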

Document Properties and First Impressions

Another tip that may help, especially when creating PDFs, is changing some of the “initial view” settings in the document properties. These settings determine how the PDF will appear whenever the user opens the document in the browser or on their desktop.

I personally prefer the following defaults in Adobe Acrobat:

My preferred document properties for Adobe Acrobat PDFs.

These settings help ensure that the user sees most, if not all, of the first page on their screen instead of part of a graphic, magnified text, or, worse, nothing at all … which can happen if the top half of your opening page is blank (for example, if you break print publications into article PDFs without cover pages). Showing the document title helps when the file is open on the user’s desktop: if they have more than one window open, the title can help them distinguish it from files in other open windows.