Visual Studio Extension to compare Git – History of Microsoft Word (.rtf and .docx) files

After much searching, I can’t find one. Any idea which extension could help us compare Microsoft Word (.rtf and .docx) files in “Git – History”? Thanks

In general, you’re going to want to use tooling that works with Git, since it will then work with any editor.

RTF files are text, and they look something like this (truncated):

{\rtf1 \ansi
{\colortbl;
\red0\green0\blue0;
\red255\green255\blue255;
\red255\green0\blue0;
\red0\green255\blue0;
\red0\green0\blue255;
\red0\green255\blue255;
\red255\green0\blue255;
\red255\green255\blue0;
\red0\green0\blue128;
\red0\green128\blue128;
\red0\green128\blue0;
\red128\green0\blue128;
\red128\green0\blue0;
\red128\green128\blue0;
\red128\green128\blue128;
\red192\green192\blue192;
}

Because this is plain text, Git can diff it without any help. The output isn’t pretty, but if you actually care about styling changes, this is a great way to see them.

.docx files are essentially zip files with XML inside, so Git can’t usefully diff them as they are. A textconv option can help here: it tells Git to run a program that converts the file to text before diffing. In your .gitattributes file, set *.docx diff=docx, and then add the following configuration to your ~/.gitconfig or .git/config:

[diff "docx"]
    textconv = pandoc -t asciidoc
    prompt = false

You can use -t markdown instead, or a different converter program, if you like that better. A similar approach works for RTF files if you find the plain source too ugly. Software recommendations are off topic, so I’ll leave further choices about viewer programs as an exercise for the reader.

The only downside with this approach is that it requires config. The .gitattributes file can be committed, but the config cannot be transferred with the repository or used in web interfaces (like GitHub). That’s because the configuration names programs to run, and Git doesn’t allow including that in the repository: an untrusted repo could otherwise execute arbitrary code, which would be a security problem. You can, however, add a setup script to the repo that sets these options for users who want them.
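A setup script along those lines might look like the sketch below. The function name setup_docx_diff and the idea of keeping it in a file in the repo are my own placeholders, and it only mirrors the textconv setting and the .gitattributes rule from above. It writes to the current clone’s .git/config, so it has to be run from inside the repository.

```shell
#!/bin/sh
# Hypothetical setup helper; the name setup_docx_diff is illustrative only.
# It registers the "docx" diff driver in this clone's .git/config.

setup_docx_diff() {
    # Warn early if the converter is missing, but still write the config.
    if ! command -v pandoc >/dev/null 2>&1; then
        echo "warning: pandoc not found on PATH; .docx diffs will fail until it is installed" >&2
    fi

    # Tell Git how to turn a .docx file into text before diffing.
    git config diff.docx.textconv "pandoc -t asciidoc" || return 1

    # Route *.docx through the driver, unless the rule is already present.
    if ! grep -qxF '*.docx diff=docx' .gitattributes 2>/dev/null; then
        printf '%s\n' '*.docx diff=docx' >> .gitattributes
    fi
}
```

Each user would source the file and call setup_docx_diff once per clone. Nothing runs automatically, which is the point of Git’s restriction.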

Once you’ve done that, you can use all of the standard Git tooling, like git diff or git log (or wrappers around it), and the files will automatically be converted to text in the format you’ve specified before diffing.
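As a rough illustration of what that looks like in practice, here are two small wrappers around git log and git diff. The function names are made up for this example, and they assume the docx driver above is already configured. Both commands apply textconv by default, so no extra flags are needed.

```shell
# Hypothetical helpers; the function names are illustrative, not Git commands.

# Show every commit that touched a document, with a text diff for each one.
docx_history() {
    git log -p -- "$1"
}

# Compare a document between two revisions (commits, tags, or branches).
docx_compare() {
    git diff "$1" "$2" -- "$3"
}
```

For example, docx_history report.docx walks the file’s history, and docx_compare HEAD~1 HEAD report.docx shows what the last commit changed (report.docx being a stand-in file name).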

Note that in general, you probably do not want to be storing zip files (including Word documents) or PDFs in your repository. They’re typically already compressed, so Git can’t deltify or compress them itself, and they bloat the repository badly. You’re much better off writing your documents as Markdown, AsciiDoc, or a similar format and then converting them (possibly with pandoc or another tool) into the desired format. Git does this with its documentation, and I’ve also used this approach for my own creative writing; it works great in both cases.
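If you go the Markdown route, the conversion step can be a one-liner. The sketch below assumes pandoc is installed and that the sources end in .md; the function names docx_name and build_docx are placeholders of mine.

```shell
# Hypothetical build helpers; the names are illustrative only.

# Derive the output path by swapping a trailing .md for .docx.
docx_name() {
    printf '%s\n' "${1%.md}.docx"
}

# Convert one Markdown source into a Word document next to it.
# pandoc picks the output format from the .docx extension.
build_docx() {
    pandoc "$1" -o "$(docx_name "$1")"
}
```

You would commit only the Markdown source and regenerate the .docx whenever someone needs it, for example with build_docx notes/chapter1.md (a stand-in path).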
