MS Office File Reading and Writing Library

Industry: Office Productivity

Overview:

Synerzip’s client Quickoffice, the worldwide leader in mobile office solutions, was acquired by Google in June 2012. Its flagship, award-winning software lets users view, edit and create Microsoft® Word, Excel and PowerPoint documents on mobile and desktop devices.

Synerzip’s Role:

Synerzip developed the Java APIs for reading and writing MS Office file formats.

The Java APIs were built for MS Office 2007 (OOXML format) and later extended for MS Office 2003 (Binary format).
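A library that handles both generations of formats has to tell them apart before choosing a reader. The sketch below shows one common way to do that, by inspecting the leading bytes of the file: OOXML files are ZIP containers, while Office 2003 binary files are OLE2 compound files. The class and method names are hypothetical and are not taken from the Quickoffice library.

```java
public class OfficeFormatSniffer {

    /** Container type inferred from the first bytes of a file. */
    public enum Format { ZIP_CONTAINER, OLE2_COMPOUND_FILE, UNKNOWN }

    // Local file header signature of a ZIP archive ("PK\3\4").
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Header signature of an OLE2 compound file (used by .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    /**
     * Classifies a file by its header bytes. A ZIP signature only says the
     * file is a ZIP container; confirming that it is really an OOXML package
     * still requires looking inside it (for example for [Content_Types].xml).
     */
    public static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2_COMPOUND_FILE;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would read the first eight bytes of the file, pass them to `sniff`, and dispatch to the OOXML or binary reader accordingly.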

The library is a Java document processing engine that works on any Java platform, including Android.

Challenges & Solutions:

The initial challenge was to understand the complete Open Packaging Conventions and the Office Open XML standard (ECMA-376), and to design a solution that could read and write Office files in conformance with ECMA-376. The second challenge was to map the older document formats (MS Office 2003 and earlier) to the newer one (MS Office 2007) with no data loss or corruption, which took a great deal of research and debugging. The team also had to work constantly with documents that did not conform to the specification; these were handled by ignoring, as far as possible, the non-conforming parts of the document.
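The approach of ignoring non-conforming parts can be illustrated with a minimal sketch. It assumes only what the standard states, that an OOXML package is a ZIP archive of XML parts, and uses the JDK's own ZIP and XML classes. The class name, the `ScanResult` type and the "skip on parse error" policy are assumptions made for illustration, not the actual design of the library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class TolerantPackageReader {

    /** Parts that parsed cleanly, plus the names of parts that were skipped. */
    public static final class ScanResult {
        public final Map<String, Document> parsedParts = new LinkedHashMap<>();
        public final List<String> skippedParts = new ArrayList<>();
    }

    /** Walks every XML part in the package; a malformed part is recorded and skipped. */
    public static ScanResult scan(InputStream packageStream) throws IOException {
        ScanResult result = new ScanResult();
        try (ZipInputStream zip = new ZipInputStream(packageStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                boolean isXmlPart = name.endsWith(".xml") || name.endsWith(".rels");
                if (entry.isDirectory() || !isXmlPart) {
                    continue; // binary parts such as images are out of scope here
                }
                byte[] bytes = readEntry(zip);
                try {
                    result.parsedParts.put(name, parse(bytes));
                } catch (Exception e) {
                    // Hypothetical policy: one bad part must not fail the whole document.
                    result.skippedParts.add(name);
                }
            }
        }
        return result;
    }

    // Reads the current ZIP entry to its end without closing the archive stream.
    private static byte[] readEntry(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    private static Document parse(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
    }
}
```

A real engine would go further, for example by recovering the well-formed portion of a damaged part, but the sketch shows the basic trade-off: the reader reports what it could not use instead of rejecting the file.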

The lack of a proper specification for the Office 2003 format was also a challenge, which the team overcame by reverse engineering the format.

Striking a fine balance between specification adherence and project delivery was difficult; this was handled by prioritizing tasks against the delivery schedule.

This library later became the basis for all Office products at Quickoffice. The main domain classes were ported from Java to C++ largely automatically, using an internal conversion tool named Chameleon. The rest of the engine code was converted manually for performance reasons.

At the Quality Assurance level too, managing the ever-growing set of test documents and testing the Java APIs across all of them was cumbersome, but the Synerzip team overcame this through proper QA automation.

The following tasks were automated:

  1. Collecting a large number of random Office documents from the internet and adding them to a test suite.
  2. Generating screenshots of each Office document in the suite before and after opening and saving the file with the Java library.
  3. Comparing the images generated in step 2, within a certain tolerance, to produce an automated report.
  4. Generating a detailed report that indicates which features are being tested and to what depth.
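The image comparison in step 3 can be sketched as follows. The source does not say how the tolerance was defined, so this example assumes two thresholds: a per-channel color tolerance for deciding whether a pixel matches, and a maximum fraction of mismatched pixels for deciding whether the round trip passes. The class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;

public class ScreenshotDiff {

    /**
     * Returns the fraction of pixels (0.0 to 1.0) whose red, green or blue
     * value differs by more than channelTolerance. Images of different
     * dimensions are treated as a complete mismatch.
     */
    public static double mismatchRatio(BufferedImage before, BufferedImage after,
                                       int channelTolerance) {
        int width = before.getWidth();
        int height = before.getHeight();
        if (width != after.getWidth() || height != after.getHeight()) {
            return 1.0;
        }
        long mismatched = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!closeEnough(before.getRGB(x, y), after.getRGB(x, y), channelTolerance)) {
                    mismatched++;
                }
            }
        }
        long total = (long) width * height;
        return total == 0 ? 0.0 : (double) mismatched / total;
    }

    /** Assumed pass criterion: at most maxMismatchRatio of the pixels may differ. */
    public static boolean roundTripLooksSame(BufferedImage before, BufferedImage after,
                                             int channelTolerance, double maxMismatchRatio) {
        return mismatchRatio(before, after, channelTolerance) <= maxMismatchRatio;
    }

    // Compares the blue, green and red channels of two packed RGB values.
    private static boolean closeEnough(int rgbA, int rgbB, int tolerance) {
        for (int shift = 0; shift <= 16; shift += 8) {
            int a = (rgbA >> shift) & 0xFF;
            int b = (rgbB >> shift) & 0xFF;
            if (Math.abs(a - b) > tolerance) {
                return false;
            }
        }
        return true;
    }
}
```

The per-channel tolerance absorbs small rendering differences such as anti-aliasing, while the mismatch ratio gives the report a single number per document that can be tracked across builds.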

Result:

The team delivered a high-performance, fault-tolerant, high-quality Java library with very little supervision.

Technologies Used: Java, JUnit, VBScript/AppleScript for Microsoft Office Automation

How can Synerzip Help You?

By partnering with Synerzip, clients rapidly scale their engineering team, decrease time to market and save at least 50 percent with our Agile development teams in India.