Regex exercises -- render human-readable one-line xml

Last updated on 2012-07-12 19:18 UTC

You can find the sed scripts in this GitHub repository

Problem: Make a one-line XML document human-readable.

Solution: This can be done in steps:

  1. insert a newline character after the XML declaration
  2. insert a newline character after the DOCTYPE declaration
  3. insert a newline character after each empty element
  4. insert a newline character after each end element
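
The four steps above could be sketched as a single sed script along the following lines. This is only an illustration: the function name break_after_tags and the expressions themselves are assumptions, not necessarily what the scripts in the repository do, and step 2 assumes a DOCTYPE declaration without an internal subset. A newline is inserted by ending the replacement line with a backslash, which avoids relying on `\n` being understood in the replacement.

```shell
#!/bin/sh
# Hypothetical helper: applies steps 1-4 to XML read from standard input.
# Basic regular expressions, with "|" as the delimiter so "/" needs no escaping.
break_after_tags() {
  sed '
# 1. newline after the XML declaration (the first "?>" on the line)
s|?>|?>\
|
# 2. newline after the DOCTYPE declaration
s|\(<!DOCTYPE[^>]*>\)|\1\
|
# 3. newline after every empty element ("/>")
s|/>|/>\
|g
# 4. newline after every end element ("</name>")
s|\(</[^>]*>\)|\1\
|g
'
}

# Usage on a small one-line document
printf '%s\n' '<?xml version="1.0"?><!DOCTYPE note SYSTEM "note.dtd"><note><to>Tove</to><empty/></note>' | break_after_tags
```

With this input, the declaration, the DOCTYPE, `<empty/>` and `</note>` each end up at the end of a line, but `<note><to>Tove</to>` stays together: nothing in these four steps separates adjacent start elements.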

The tricky part is inserting newline characters between consecutive start elements. Because of the way sed works,
it seems it cannot act on overlapping matches, so a single pass cannot break up a run of more than two start elements. I work around
this problem by calling sed repeatedly until the output is the same as the input (i.e. no more substitutions are possible).
This is probably not the best solution if the input file is huge.
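
A minimal sketch of that repeat-until-unchanged loop is given below. The function name split_start_tags and the regular expression are assumptions made for illustration. The expression matches two adjacent tags that are not end elements, declarations or processing instructions, and because each match consumes both tags, the second tag cannot also serve as the first tag of the next match.

```shell
#!/bin/sh
# Hypothetical helper: reads XML from standard input and re-runs one sed
# substitution until a pass changes nothing (a fixed point).
split_start_tags() {
  prev=$(cat)
  while :; do
    # Split every non-overlapping pair of adjacent opening tags.
    next=$(printf '%s\n' "$prev" | sed 's|\(<[^/!?>][^>]*>\)\(<[^/!?>][^>]*>\)|\1\
\2|g')
    # Stop as soon as the output equals the input.
    [ "$next" = "$prev" ] && break
    prev=$next
  done
  printf '%s\n' "$prev"
}

# Usage: three start elements in a row need two effective passes
printf '%s\n' '<a><b><c>x</c></b></a>' | split_start_tags
```

On this input the first pass only separates `<a>` from `<b>`, the second pass separates `<b>` from `<c>`, and a third pass confirms that nothing is left to change. The whole input is held in a shell variable and re-scanned on every pass, which is why the approach scales poorly to huge files.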

The exercise is contained in the xml folder.