I have two folders, “user_guide-1.7.0” and “user_guide-2.0.0”.
They contain the same set of HTML files, but in different versions.
I want to find the differences between them, but I am not interested in differences in the header sections, footer information, and so on.
What I want is the difference in only the part of each file that starts with
<!-- START CONTENT -->
and ends with
<!-- END CONTENT -->
To use the diff program for this, I need to pull out only that part of each file in both versions and compare just the extracted parts. The only thing I do not know how to do is extract the relevant part from the HTML documents.
So my question can also be put this way:
How do I get the part of a file starting with: <!-- START CONTENT -->
and ending with: <!-- END CONTENT -->
You could look into the csplit tool (see man csplit, of course). It splits a file at the lines that match a pattern. In your case, each file of a pair would be split into three pieces (header, content, footer), and you then run diff on the second pieces only. Delete the pieces before moving on to the next pair of files.
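
As a rough sketch of that idea, something along these lines might work. It assumes the markers are literally <!-- START CONTENT --> and <!-- END CONTENT -->, that each one appears exactly once per file and on a line of its own, and that csplit, mktemp -d and diff -u are available. The function names extract_content and diff_content are made up for this example.

```shell
#!/bin/sh
# Sketch only: diff just the region between the content markers.
# Assumption: each marker occurs once per file, on its own line.

# extract_content FILE PREFIX
# Splits FILE into PREFIX00 (header), PREFIX01 (content, beginning at
# the START marker line) and PREFIX02 (END marker line and footer).
extract_content() {
    csplit -s -f "$2" "$1" '/<!-- START CONTENT -->/' '/<!-- END CONTENT -->/'
}

# diff_content OLD NEW
# Returns diff's status (0 same, 1 different) or 2 if a split failed.
diff_content() {
    tmp=$(mktemp -d) || return 2
    if extract_content "$1" "$tmp/old." && extract_content "$2" "$tmp/new."; then
        diff -u "$tmp/old.01" "$tmp/new.01"
        rc=$?
    else
        rc=2
    fi
    rm -rf "$tmp"    # throw the pieces away before the next pair
    return "$rc"
}

# Compare every HTML file that exists in both versions.
for old in user_guide-1.7.0/*.html; do
    new="user_guide-2.0.0/${old##*/}"
    if [ -f "$old" ] && [ -f "$new" ]; then
        echo "== ${old##*/}"
        diff_content "$old" "$new" || true   # non-zero usually just means "differs"
    fi
done
```

Run it from the directory that contains both folders. Files that exist in only one version are skipped silently, so you may want to list those separately.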