HtmlDiff

From W3C Wiki



It would be nice to have a comparison of the various tools to see how well they handle different kinds of change: large sections that have moved, rewritten sections, minor differences between versions, changes visible only through view-source (such as new attributes), whitespace-only changes, etc.


I ran the Python and Perl scripts on two HTML files like this:


<html>
<body>
<h1>Lorem Ipsum</h1>
<p class="c1">
Lorem ipsum dolor sit amet...
</p>
</body>
</html>


  • File 1 had <p class="c1"> (shown above)
  • File 2 had <p class="c2">.
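
For reference, the test pair can be reproduced and checked with a plain line-based diff. This is only a sketch using Python's standard difflib module, which is not one of the htmldiff tools discussed here; the names TEMPLATE, make_pair and changed_lines are made up for illustration.

```python
import difflib

# The sample document from above, with the class name left as a placeholder.
TEMPLATE = """<html>
<body>
<h1>Lorem Ipsum</h1>
<p class="{cls}">
Lorem ipsum dolor sit amet...
</p>
</body>
</html>
"""


def make_pair():
    """Return (file 1, file 2): identical except for the class on <p>."""
    return TEMPLATE.format(cls="c1"), TEMPLATE.format(cls="c2")


def changed_lines(old, new):
    """Return only the removed/added lines of a unified diff."""
    diff = difflib.unified_diff(
        old.splitlines(), new.splitlines(), lineterm="", n=0
    )
    # The first two lines are the '---' / '+++' file headers; skip them,
    # then keep only lines marked as removed ('-') or added ('+').
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    for line in changed_lines(*make_pair()):
        print(line)
```

A source-level diff like this does report the changed line, but it says nothing about whether the change is visible in the rendered page, which is the open question here.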

It's not clear what an htmldiff tool should do with such (or similar) input.

  • The Python script produced
...<h1>Lorem Ipsum</h1>
<del class="diff modified"><p class="c1"></del><ins class="diff modified"><p class="c2"></ins>
Lorem ipsum dolor sit amet...
The difference was picked up, but not in a usable way: the <del> and <ins> elements each wrap a lone <p> start tag, so how is this HTML going to be processed?
  • The Perl script produced
<html>
<body>
<h1>
Lorem
Ipsum
</h1>
<p class="c2">
Lorem
ipsum
dolor
sit
amet...
The difference was not picked up (only the class="c2" version appears), but maybe this output is more usable.

I would expect other attribute changes to be handled the same way.

Maybe a user-friendly inline diff is not possible for HTML? What if the class, or other attribute change, doesn't make any difference in the final display? No diff tool is going to be able to tell. On the other hand, if there is no visible difference, the user is not going to notice one either when just shown two apparently identical blocks. Showing the user the class difference itself would confuse many people.
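
One way a tool could at least separate this case out is to compare the two documents tag by tag and report when nothing but attributes differ. The following is a rough sketch, assuming Python's standard html.parser module; TokenCollector, tokenize and attribute_only_changes are hypothetical names, not part of any of the tools mentioned here. It still cannot tell whether an attribute change affects the display.

```python
from html.parser import HTMLParser


class TokenCollector(HTMLParser):
    """Flatten a document into (kind, name, attrs) tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs; a dict makes
        # the comparison independent of attribute order.
        self.tokens.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag, {}))

    def handle_data(self, data):
        text = " ".join(data.split())  # normalise whitespace
        if text:
            self.tokens.append(("text", text, {}))


def tokenize(html):
    collector = TokenCollector()
    collector.feed(html)
    collector.close()
    return collector.tokens


def attribute_only_changes(old_html, new_html):
    """Return a list of (tag, old_attrs, new_attrs) if the documents differ
    in attributes only, or None if tags or text differ as well."""
    old, new = tokenize(old_html), tokenize(new_html)
    if len(old) != len(new):
        return None
    changes = []
    for (kind1, name1, attrs1), (kind2, name2, attrs2) in zip(old, new):
        if (kind1, name1) != (kind2, name2):
            return None  # structure or text changed
        if attrs1 != attrs2:
            changes.append((name1, attrs1, attrs2))
    return changes
```

For the two test files this reports a single change on the p element, which a tool could then show separately from the text diff, or hide.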

Maybe use an outright "diff only on the text and images" and indicate differences in layout using IFRAMEs (side-by-side view, not inline highlights).
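
A "text and images only" diff could look roughly like the sketch below. It assumes Python's standard html.parser and difflib modules; VisibleContent, visible_items and content_diff are illustrative names only, and the side-by-side layout view is not covered.

```python
import difflib
from html.parser import HTMLParser


class VisibleContent(HTMLParser):
    """Collect words and image sources, ignoring all other markup."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src") or ""
            self.items.append("[img " + src + "]")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.items.extend(data.split())  # one item per word


def visible_items(html):
    parser = VisibleContent()
    parser.feed(html)
    parser.close()
    return parser.items


def content_diff(old_html, new_html):
    """Return (operation, old_items, new_items) for each changed run."""
    a, b = visible_items(old_html), visible_items(new_html)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (op, a[i1:i2], b[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]
```

With this approach the class="c1" / class="c2" pair produces no differences at all, so the attribute change would have to be surfaced some other way, such as the side-by-side view.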

I didn't look at other tools from the list. -- swehner DateTime(2007-11-30T02:20:09Z)