Well-formed vs valid

These two terms are often used interchangeably, but they mean different things, and most of the time you need only the first.

Well-formed means the document obeys XML's basic syntax rules. There is exactly one root element, every start tag has a matching end tag, elements are properly nested, attribute values are quoted, and reserved characters are escaped. Well-formedness is a property of any XML document and requires no external definition — a parser can check it on its own.

Valid is a stronger claim: the document is well-formed and conforms to a schema — a DTD, XSD (XML Schema), or RELAX NG grammar — that defines which elements may appear, in what order, with which attributes and data types. Validation requires that schema. All valid XML is well-formed; the reverse is not true.

Most of the time, when people ask to "check XML syntax" or "verify XML," they mean well-formedness — does it parse? That's what a browser-based checker tests. Validation against a schema is a separate, more specialized step.

The errors that break well-formedness

Almost every "XML won't parse" problem is one of these:

  • Unclosed or mismatched tags. <name>Alice</Name> fails — XML is case-sensitive, so name and Name are different tags.
  • More than one root element. A document may have exactly one top-level element. Two siblings at the top level trigger the classic "junk after document element" error.
  • Unescaped special characters. A literal & or < in text must be written as &amp; or &lt; (or wrapped in <![CDATA[ ]]>). A bare > is legal everywhere except in the sequence ]]>, but it is commonly written as &gt; for symmetry.
  • Unquoted attribute values. <book id=bk101> is not well-formed; attribute values must be in single or double quotes.
  • Improper nesting. <b><i>text</b></i> closes the parent before the child. Tags must nest like matched parentheses.
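  • Content before the declaration. The <?xml version="1.0"?> declaration, if present, must be the very first thing in the document: no whitespace, blank line, or comment may precede it. The one exception is a byte order mark (BOM), which parsers treat as an encoding signature rather than content.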
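
When XML is generated from code, the escaping rule is easiest to satisfy by never concatenating raw text into markup. The following is a minimal Python sketch, assuming the standard library's xml.sax.saxutils helpers; make_note and the note/author names are hypothetical, chosen only for illustration:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def make_note(text, author):
    """Build a <note> element as a string (hypothetical helper for illustration)."""
    # escape() replaces &, < and > in character data;
    # quoteattr() escapes the value and wraps it in suitable quotes.
    return f"<note author={quoteattr(author)}>{escape(text)}</note>"


xml_text = make_note("Fish & chips < $5", 'Ann "AJ" Lee')
root = ET.fromstring(xml_text)  # parses cleanly because the input was escaped
print(root.text)
print(root.get("author"))
```

Parsing the result back, as the last lines do, is a cheap way to confirm that the generated string is well-formed and that the original text survives the round trip.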

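A parser typically reports the first error it hits, with a line and column number, and then stops. The XML specification treats well-formedness errors as fatal, so a conforming parser will not hand back a partial document (recovery modes such as xmllint --recover are opt-in). Fix the first error and re-check.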
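
To see what these failures look like in practice, the sketch below feeds one broken snippet per category to Python's standard-library parser and prints the position and message of the first error. The BROKEN samples and the first_error helper are illustrative names, not part of any library, and the exact message wording depends on the underlying parser:

```python
import xml.etree.ElementTree as ET

# One deliberately broken snippet per error category (illustrative samples).
BROKEN = {
    "mismatched case": "<name>Alice</Name>",
    "two root elements": "<a/><b/>",
    "unescaped ampersand": "<t>Fish & chips</t>",
    "unquoted attribute": "<book id=bk101/>",
    "improper nesting": "<b><i>text</b></i>",
    "blank line before declaration": '\n<?xml version="1.0"?><a/>',
}


def first_error(xml_text):
    """Return (line, column, message) for the first error, or None if well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # position as reported by the parser
        return line, column, str(err)
    return None


for label, sample in BROKEN.items():
    print(label, "->", first_error(sample))
```

Only one error is reported per run, which is why the fix-and-re-check loop is the normal workflow.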

Checking in the browser

The browser has a built-in XML parser, which is what online checkers use:

const doc = new DOMParser()
  .parseFromString(xmlString, "application/xml");
const error = doc.querySelector("parsererror");
if (error) {
  console.log("Not well-formed:", error.textContent);
} else {
  console.log("Well-formed ✓");
}

If parsing fails, the browser inserts a <parsererror> element describing the problem, usually with a line number. This runs entirely client-side, so the XML never leaves the page.

Checking on the command line with xmllint

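xmllint is the command-line tool that ships with libxml2. It is preinstalled on macOS and available on virtually every Linux distribution, though some package it separately (libxml2-utils on Debian and Ubuntu). It is the standard way to check XML from a shell or in CI: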

# Well-formedness check (no output = success)
xmllint --noout file.xml

# Validate against an XSD schema
xmllint --noout --schema schema.xsd file.xml

# Validate against a DTD
xmllint --noout --dtdvalid doc.dtd file.xml

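On success the well-formedness check prints nothing and exits 0, which is handy for scripting. The --schema check additionally prints "file.xml validates" to stderr when the document passes. On failure xmllint prints the error with a line number and exits non-zero.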
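
Because the exit status carries the result, a script can check many files without parsing xmllint's output. This is a rough Python sketch, assuming xmllint is on the PATH; xmllint_ok and failing_files are hypothetical helper names:

```python
import subprocess


def xmllint_ok(path):
    """Return (ok, diagnostics) for one file, based on xmllint's exit status."""
    result = subprocess.run(
        ["xmllint", "--noout", path],
        capture_output=True,  # diagnostics arrive on stderr
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


def failing_files(paths):
    """Return {path: diagnostics} for every file that is not well-formed."""
    failures = {}
    for path in paths:
        ok, diagnostics = xmllint_ok(path)
        if not ok:
            failures[path] = diagnostics
    return failures
```

In a CI step, one option is to print each entry of failing_files(...) and exit non-zero if the dictionary is not empty.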

Checking in Python

import xml.etree.ElementTree as ET

try:
    ET.parse("file.xml")
    print("Well-formed ✓")
except ET.ParseError as e:
    print(f"Not well-formed: {e}")  # includes line:column

For schema validation, the third-party lxml library adds lxml.etree.XMLSchema, which checks a parsed document against an XSD and reports which rule failed and on which line.
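
A small sketch of that validation step follows, assuming lxml is installed (pip install lxml). The validation_errors helper, the book schema, and both sample documents are made up for illustration. The invalid sample is well-formed but breaks the schema, which is exactly the gap between the two terms:

```python
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="year" type="xs:integer"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

VALID = b"<book><title>XML Basics</title><year>2024</year></book>"
INVALID = b"<book><year>soon</year></book>"  # well-formed, but not valid


def validation_errors(xml_bytes, xsd_bytes):
    """Return a list of schema violations; an empty list means the document is valid."""
    # Imported here so the sketch loads even where lxml is not installed.
    from lxml import etree

    schema = etree.XMLSchema(etree.fromstring(xsd_bytes))
    doc = etree.fromstring(xml_bytes)  # raises XMLSyntaxError if not well-formed
    if schema.validate(doc):
        return []
    return [f"line {entry.line}: {entry.message}" for entry in schema.error_log]
```

Calling validation_errors(INVALID, XSD) should return at least one message, while validation_errors(VALID, XSD) returns an empty list.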

Frequently Asked Questions

What is the difference between well-formed and valid XML?

Well-formed means the document follows XML syntax rules — one root, every tag closed and nested, attributes quoted, special characters escaped. Valid means it is well-formed AND conforms to a schema (DTD, XSD, or RELAX NG). All valid XML is well-formed, but not all well-formed XML is valid.

What are the most common XML syntax errors?

Unclosed or mismatched tags, more than one root element, unescaped & or < in text, missing quotes around attribute values, and improper nesting. Any one of these makes the document not well-formed and stops parsing at that point.

How do I check XML syntax on the command line?

Use xmllint, which ships with libxml2 on most Linux and macOS systems. Run xmllint --noout file.xml to check well-formedness; it prints nothing on success and an error with a line number on failure. Add --schema or --dtdvalid to also check validity.

Is XML case-sensitive?

Yes. Element and attribute names are case-sensitive, so <Book> and <book> are different elements and a start tag must match its end tag exactly in case. This is a frequent source of "mismatched tag" errors for people coming from HTML.