Open XML II: Editing Documents in the XML

The top portion of a document.xml file

This image shows the top portion of a document.xml file. Notice that the XML tags (such as <w:body> and <w:p>) are nested within one another.

At the top of the document.xml file, you see a few elements that are similar across most document parts in an Open XML Format ZIP package.

  • The XML version information typically appears at the top of the file, as you see here in blue.
  • The information in red is a set of namespace references. (Notice the text xmlns in this portion of the content, which stands for XML namespace.) A namespace is a way to uniquely identify the element names used within the package, so that names defined in different places cannot be confused with one another. You do not need to work with this content for tasks such as those in this course.
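
To make the namespace idea more concrete, here is a minimal Python sketch. It assumes a hand-written, heavily simplified stand-in for the top of a document.xml file (a real file declares many more namespaces), and it uses the standard xml.etree.ElementTree module. The names SAMPLE, namespace_uri, and local_name are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for the top of a document.xml file.
# A real file declares many more namespaces on the <w:document> tag.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body/>'
    '</w:document>'
)

root = ET.fromstring(SAMPLE.encode("utf-8"))

# ElementTree replaces the w: prefix with the full namespace URI in braces,
# so the tag looks like "{namespace-uri}document".
namespace_uri, local_name = root.tag[1:].split("}")
print(namespace_uri)
print(local_name)
```

The w: prefix is only shorthand. What identifies the tag is the namespace URI that the xmlns:w reference assigns to that prefix.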

The most important thing to notice here is the XML structure. The first tag (<w:document…>, which begins on the second line in the image) is the outermost tag in the file.

  • The next tag down, which starts the body of the document, is <w:body>.
  • Nested within the body tag is all content in the body of the document, such as the first paragraph, shown here.
  • The paragraph starts with the tag <w:p> (which translates to Word paragraph) and ends with the tag </w:p>. Most tags in an XML document part are paired in this way: the second tag of the pair begins with a slash character, which marks the end of the content defined by that tag. The two tags in a pair are referred to as the start tag and the end tag.
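
The nesting described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the contents of the file in the image: SAMPLE is a hand-written, minimal document part, the helper outline is hypothetical, and the <w:r> and <w:t> tags inside the paragraph (a run and its text) are included only so that the paragraph has some content.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hand-written, minimal stand-in for a document.xml file (an assumption,
# not the file shown in the image). Every start tag has a matching end tag.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    f'<w:document xmlns:w="{W_NS}">'
    '<w:body>'
    '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>'
    '</w:body>'
    '</w:document>'
)

def outline(element, depth=0):
    """Return one indented line per element, outermost tag first."""
    # Swap the expanded namespace URI back to the familiar w: prefix.
    name = element.tag.replace("{" + W_NS + "}", "w:")
    lines = ["  " * depth + name]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

tree_lines = outline(ET.fromstring(SAMPLE.encode("utf-8")))
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting: w:document contains w:body, which in turn contains w:p.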
