Shlomi Fish
2014-08-08 08:58:53 UTC
Hello docbook-apps.
I have a large set of projects that I am looking to scrub for unused
graphics and XML files prior to sending off to localization.
Some of my colleagues have created some very basic bash and batch
scripts to scan through the folders and find files that aren't referenced
in any of the source files so we can delete them, but I worry that these
scripts don't catch everything (unused XML files in the base directory that
reference images will "bless" these images) and we could still have
extraneous files left over or accidentally delete important ones
unknowingly.
Each project has a book.xml file that is the gold master for the
outputs. If the book.xml file or any of its includes doesn't reference a
file in the project, it's safe to delete. I was hoping that I could use
xmllint to tell me which files are loaded when I try to validate the
book.xml, but I haven't found the magic formula yet.
I've tried the following command to reference all of the loaded files
during a pass, but it doesn't seem to list the image files referenced,
which is mostly the point of this exercise, and I get a lot of noise from
the module files for the DTD on every include.
$ xmllint --load-trace book.xml --xinclude --noout &> test1
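To cut down the DTD noise in that output, something like the following Python filter might help. It is only a sketch: it assumes each trace line has the form Loaded URL="..." ID="...", and the function name loaded_files and the list of suffixes to skip are my own placeholders. Note that the trace can only show what the parser actually opens; an image named in a fileref attribute is just attribute text to the parser, which would explain why images never appear.

```python
import re

# Assumed shape of one --load-trace line: Loaded URL="..." ID="..."
LOADED = re.compile(r'^Loaded URL="([^"]*)" ID="([^"]*)"')


def loaded_files(trace_text, skip_suffixes=(".dtd", ".mod", ".ent")):
    """Return the unique URLs from a load trace, minus DTD module files."""
    files = []
    for line in trace_text.splitlines():
        match = LOADED.match(line.strip())
        if not match:
            continue  # not a trace line (e.g. a validation message)
        url = match.group(1)
        if url.lower().endswith(skip_suffixes):
            continue  # DTD, module, or entity file: noise for this purpose
        if url not in files:
            files.append(url)  # keep first-seen order, drop repeats
    return files
```

With the trace saved to test1 as above, loaded_files(open("test1").read()) would give the XML files pulled in by the XIncludes, without the repeated DTD modules.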
Has anyone had a similar problem to solve? Am I going about this the
right way?
Thanks, and I'm open to any suggestion. If bash and xmllint don't work
here, I am partial to Python as an alternative. Just saying.
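For the Python route, here is a rough sketch of what I have in mind: start at book.xml, follow every XInclude href, record every fileref attribute, and report whatever is on disk but never reached. The names collect_referenced and find_unused are hypothetical, and the sketch assumes that images are referenced through fileref attributes, that all references are relative paths, and that the files parse with the standard library parser (files that rely on entities declared in an external DTD may not, in which case they are skipped here).

```python
import os
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"


def collect_referenced(root_file):
    """Return absolute paths of all files reachable from root_file."""
    seen = set()
    stack = [os.path.abspath(root_file)]
    while stack:
        path = stack.pop()
        if path in seen or not os.path.isfile(path):
            continue
        seen.add(path)
        if not path.lower().endswith(".xml"):
            continue  # images and other assets are recorded, not parsed
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # assumption: unparsable files are simply not followed
        base = os.path.dirname(path)
        for elem in tree.iter():
            # XIncludes point at files via href; images via fileref.
            attr = "href" if elem.tag == XI_INCLUDE else "fileref"
            ref = elem.get(attr)
            if ref and "://" not in ref:
                target = ref.split("#")[0]  # drop any fragment identifier
                stack.append(os.path.normpath(os.path.join(base, target)))
    return seen


def find_unused(project_dir, root_name="book.xml"):
    """Return project-relative paths never reached from root_name."""
    project_dir = os.path.abspath(project_dir)
    used = collect_referenced(os.path.join(project_dir, root_name))
    unused = []
    for dirpath, _dirs, files in os.walk(project_dir):
        for name in files:
            full = os.path.join(dirpath, name)
            if full not in used:
                unused.append(os.path.relpath(full, project_dir))
    return sorted(unused)
```

Because the walk only starts from book.xml, an orphaned XML file in the base directory cannot "bless" the images it references: both the orphan and its images end up in the list returned by find_unused("path/to/project"). I would review that list by hand before deleting anything.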
*Eric Nordlund*
Senior Technical Writer
Amazon Web Services