Advertisement
Advertisement


How to pretty print XML from the command line?


Question

Related: How can I pretty-print JSON in (unix) shell script?

Is there a (unix) shell script to format XML in human-readable form?

Basically, I want it to transform the following:

<root><foo a="b">lorem</foo><bar value="ipsum" /></root>

... into something like this:

<root>
    <foo a="b">lorem</foo>
    <bar value="ipsum" />
</root>
2017/05/23
1
544
5/23/2017 12:10:45 PM

Accepted Answer

libxml2-utils

This utility comes with libxml2-utils:

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    xmllint --format -

Perl's XML::Twig

This command comes with XML::Twig module, sometimes xml-twig-tools package:

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    xml_pp

xmlstarlet

This command comes with xmlstarlet:

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    xmlstarlet format --indent-tab

tidy

Check the tidy package:

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    tidy -xml -i -

Python

Python's xml.dom.minidom can format XML (both python2 and python3):

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    python -c 'import sys;import xml.dom.minidom;s=sys.stdin.read();print(xml.dom.minidom.parseString(s).toprettyxml())'

saxon-lint

You need saxon-lint:

echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    saxon-lint --indent --xpath '/' -

saxon-HE

You need saxon-HE:

 echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' |
    java -cp /usr/share/java/saxon/saxon9he.jar net.sf.saxon.Query \
    -s:- -qs:/ '!indent=yes'
2019/01/21
931
1/21/2019 12:49:53 PM

xmllint --format yourxmlfile.xml

xmllint is a command line XML tool and is included in libxml2 (http://xmlsoft.org/).

================================================

Note: If you don't have libxml2 installed you can install it by doing the following:

CentOS

cd /tmp
wget ftp://xmlsoft.org/libxml2/libxml2-2.8.0.tar.gz
tar xzf libxml2-2.8.0.tar.gz
cd libxml2-2.8.0/
./configure
make
sudo make install
cd

Ubuntu

sudo apt-get install libxml2-utils

Cygwin

apt-cyg install libxml2

MacOS

To install this on MacOS with Homebrew just do: brew install libxml2

Git

Also available on Git if you want the code: git clone git://git.gnome.org/libxml2

2019/01/27

You can also use tidy, which may need to be installed first (e.g. on Ubuntu: sudo apt-get install tidy).

For this, you would issue something like following:

tidy -xml -i your-file.xml > output.xml

Note: has many additional readability flags, but word-wrap behavior is a bit annoying to untangle (http://tidy.sourceforge.net/docs/quickref.html).

2014/10/12

You didn't mention a file, so I assume you want to provide the XML string as standard input on the command line. In that case, do the following:

$ echo '<root><foo a="b">lorem</foo><bar value="ipsum" /></root>' | xmllint --format -
2013/04/18

Without installing anything on macOS / most Unix.

Use tidy

cat filename.xml | tidy -xml -iq

Redirecting viewing a file with cat to tidy specifying the file type of xml and to indent while quiet output will suppress error output. JSON also works with -json.

2020/04/22

xmllint support formatting in-place:

for f in *.xml; do xmllint -o $f --format $f; done

As Daniel Veillard has written:

I think xmllint -o tst.xml --format tst.xml should be safe as the parser will fully load the input into a tree before opening the output to serialize it.

Indent level is controlled by XMLLINT_INDENT environment variable which is by default 2 spaces. Example how to change indent to 4 spaces:

XMLLINT_INDENT='    '  xmllint -o out.xml --format in.xml

You may have lack with --recover option when you XML documents are broken. Or try weak HTML parser with strict XML output:

xmllint --html --xmlout <in.xml >out.xml

--nsclean, --nonet, --nocdata, --noblanks etc may be useful. Read man page.

apt-get install libxml2-utils
apt-cyg install libxml2
brew install libxml2
2018/09/27