This week's book giveaway is in the OCAJP 8 forum. We're giving away four copies of OCA Java SE 8 Programmer I Study Guide and have Edward Finegan & Robert Liguori on-line! See this thread for details.
We do a lot of our automated testing with the help of excel. We have 2 applications that do a lot of background analysis of financial data. We have functional "testers" whose expertise is more domain knowledge than technical. Their technical knowledge is that they can a) use windows well enough b) they are very good in excel, good enough that they know how to implement the business rules of the applications in excel c) they know how to write sql queries. They cannot program.. Some of them might be able to write VBScript macros in excel, but that's the extent of it
So, to automate the testing done by the functional team, we created an Excel test driver. Basically, the functional testers define all the inputs and expected outputs in an excel file. The test driver loads in the inputs, runs the engine that does the processing, and then checks the expected output with the actual output in database. The testers create an excel file for every test point, and the CI server runs em nightly. Over time, the number of excel cases have grown to be in the thousands. At this point, the number of excel cases has far surpassed the number of java files.
The problem is that we were thinking of having branches, and we couldn't figure it a way how we will manage conflicts in the excel files. SVN sees them as binary files, and is not able to auto merge them. Is there a tool that can do a 3 way merge on excel files?
I'm afraid it's not going to be easy, if possible at all.
I'd suggest starting with subversion docx merge. The docx is new Word 2007 format which is actually a ZIP file containing XMLs and some small binary files. Similarly, XLS can be stored as XLSX or XLSM (with macros) in Excel 2007. If you're stuck with an older version of Office, it should still be possible to install (Microsoft's own) converters that allow opening/saving these new formats. You might also search for subversion xlsx merge, but it seems that docx gives more hits (people version Word documents more often, it seems).
It looks like there are some specific tools that directly support 3-way merge of Word documents (so no luck with Excel), but more generic solutions based on diffing the XML files inside individual documents might perhaps be available as well.
Mercurial has a ZipDoc extension that also utilizes the fact these documents are zipped up XML files, but the aim there is to reduce repository size by using more effective textual diff to store deltas. Merging is still problematic, as there isn't support for updating ZIP metadata if the merge changes sizes of the XML documents. But it might serve as an inspiration (and if you do find a solution that works in Mercurial, you might perhaps connect Mercurial to SVN).
Thanks Martin. So, XLSX files are basically zip files that contains XML files. Also Excel can store the excel files as ODS which are also zip files that contain xml files. So, thereotically, we could unzip, merge and then zip. However, the problem is that when there are conflicts, the diff would be on the XML, and looking at the XML it is very messy because the XML mixes up the content with the style. we don;t care about the styles, only the content. Although it will be messy, it's a possible solution
So, possible options are
a) Implement a tool that can merge XLSX by unzipping, using diff3 to merge the xml and then zipping it back again
b) Implement a tool that can merge ODS by unzipping, using diff3 to merge the xml and then zipping it back again
c) Implement my own Excel merge tool that uses POI to read the contents of the file, and performs a 3 way merge
a) and b) might be messy but will be easier to implement.