Distributed document creation - Chris' Externalized Inner Monologue

One of my jobs involves creating documents. >10MiB PDFs, or piles of dead trees, if you will. Big documents with several contributors, providing revision after revision of their part of the cake.

The process looks roughly like this:

1. The coordinator releases a structural guideline, with annotations indicating which part belongs to which contributor. To a certain degree, contributors may change the internal structure of their respective sections, as long as they stay consistent with related sections belonging to other contributors.

   Files: structure.docx

2. Contributors write the first revision of their sections and supply it to the coordinator and other contributors for peer review.

   Files: structure.docx, contribution_1.docx … contribution_n.docx

3. Contributors refactor their sections and submit the new revisions for peer review again, as in step 2. Repeat this step as many times as necessary.

   Files: structure.docx, contribution_1_rev1.docx … contribution_n_revM.docx

4. The editor incorporates the latest (and hopefully last) revision of each contribution into the document outlined by structure.docx.

   Files: structure.docx, contribution_1_rev1.docx … contribution_n_revM.docx, document.docx

5. At this point, the involved parties can inspect the compiled document for the first time in its near-final form. Of course, some things aren’t quite right. So until they are, repeat steps 3-4.

Documents are distributed via e-mail and an online project management tool with basic document management capabilities (you can upload and group files).

What’s wrong?

1. A structural outline is provided (which is good), but there is no clear style directive. The contributors format and style their contributions however they want, which leads to inconsistent citations, language, and structural decisions (Does this get its own caption? Should I write long sentences or short ones?).

   As a result, the editor spends a great deal of time harmonizing the various styles so that everything fits into a common document. This involves restructuring, and even rewriting.

2. Maybe this doesn’t deserve its own point, since it can be called the cause of the problem above: it’s Microsoft Word. It moves the contributors’ focus away from what they actually need to do.

   Their job is to produce content, not to format it in some fancy way that’s guaranteed to break as soon as someone tries to incorporate it into the main document.

3. It may become hard to trace a section back to its original author once the contributions get entangled in the main document. That matters when the content needs fixing and the original author would be the ideal person to do it.

The solution

What I’m about to propose is by no means the via regia for compiling a document from various contributions, but it may be a nice approach for technology-inclined people who find themselves in a situation like mine.

1. Set up a source control system (SCS). Choose whatever suits your organisation’s religion. It doesn’t matter if it’s git, mercurial, svn, or anything else, centralized or decentralized, as long as it tracks changes and the identities of those who made them. If less technical people are involved, you might prefer an SCS with a beginner-friendly GUI, or write a small task-specific tool.
2. In the SCS-tracked directory, create one folder for each contributor.
3. Set up a file hierarchy for a LaTeX document. Take care to split the parts of the main document into many files. Store those files in the directory of the contributor responsible for creating the content for the section in question!
4. Supply the contributors with SCS access and, if necessary, instructions on how to get started with the SCS and LaTeX.
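
The folder and file layout described above could be scaffolded with a small script. This is only a sketch under assumptions of mine: the contributor names, the section names, the `report` document class, and the file name `main.tex` are placeholders, not something the process prescribes.

```python
from pathlib import Path

# Hypothetical assignment of sections to contributors; all names are placeholders.
SECTIONS = {
    "alice": ["introduction", "requirements"],
    "bob": ["architecture"],
}


def main_tex(sections):
    """Return the text of a main file that pulls in every section via \\input."""
    lines = [r"\documentclass{report}", r"\begin{document}"]
    for contributor, names in sections.items():
        for name in names:
            # One \input per section file, e.g. \input{alice/introduction}
            lines.append(rf"\input{{{contributor}/{name}}}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


def scaffold(root, sections):
    """Create one folder per contributor, a stub per section, and main.tex."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for contributor, names in sections.items():
        folder = root / contributor
        folder.mkdir(exist_ok=True)
        for name in names:
            stub = folder / f"{name}.tex"
            if not stub.exists():  # never overwrite a contributor's work
                title = name.replace("_", " ").capitalize()
                stub.write_text(
                    f"\\chapter{{{title}}}\n% Owner: {contributor}\n",
                    encoding="utf-8",
                )
    (root / "main.tex").write_text(main_tex(sections), encoding="utf-8")
```

Calling `scaffold("document", SECTIONS)` inside the tracked directory would create `document/alice/introduction.tex`, `document/alice/requirements.tex`, `document/bob/architecture.tex`, and a `document/main.tex` that includes them in order. Contributors then only ever touch the files in their own folder, while the editor owns `main.tex` and the formatting.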

Pros and cons

Pro: The current state of the main document is available at any time, without the need for manual integration.

Pro: Formatting is delegated entirely to the editor and the LaTeX compiler. Contributors can focus on writing content.

Pro: The SCS provides insight into which parts of the document were produced by whom. Also, there is a clear timeline of changes. The editor can switch back and forth between several versions of a section without changing anything in the rest of the document, with zero integration effort.
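
As an illustration of the "who wrote what" part, here is a rough sketch that assumes git was chosen as the SCS. It counts lines per author in the output of `git blame --line-porcelain`; the function names are my own, and other systems offer comparable commands.

```python
import subprocess
from collections import Counter


def count_lines_per_author(porcelain):
    """Count lines per author in `git blame --line-porcelain` output."""
    counts = Counter()
    for line in porcelain.split("\n"):
        # Header lines look like "author Jane Doe". File content lines
        # start with a tab, so they can never match this prefix.
        if line.startswith("author "):
            counts[line[len("author "):]] += 1
    return counts


def blame(path, repo="."):
    """Run git blame on one file and return the per-author line counts."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return count_lines_per_author(result.stdout)
```

With one folder per contributor, most files will have a single dominant author anyway; a count like this mainly helps to spot sections that someone else has quietly edited.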

Con: The majority of the population has no experience with LaTeX and, due to the paradigm shift compared to WYSIWYG text editing, will probably have a hard time getting used to it. The benefits of this solution might not outweigh the additional effort of training those people.

Con: Source control systems, just like LaTeX, may be unknown and hard to understand for many people.

Con: It’s quite hard to convince the ones in charge to choose such an unconventional approach. Especially when a significant amount of money and a tight schedule are involved, decision makers tend to stay conservative, even in the face of a very promising alternative.

Conclusion and disclaimer

The presented solution enables a group of contributors to collaboratively create a large document, while allowing writers to focus on content and delegating formatting to editors.

Compared to a classic docx sharing approach, the cost of integrating contributions into the complete document is zero, as it is done automatically by checking out from the SCS and running the LaTeX compiler. This allows for easier review of the complete document by all parties involved.
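
The "check out and compile" step might look roughly like this. It is a sketch that assumes git as the SCS, `latexmk` as the build driver, and a top-level file called `main.tex`; none of these choices is essential to the approach.

```python
import subprocess


def build_commands(main_file="main.tex"):
    """Return the commands that refresh the working copy and build the PDF."""
    return [
        ["git", "pull", "--ff-only"],      # fetch everyone's latest revisions
        ["latexmk", "-pdf", main_file],    # rerun LaTeX as often as needed
    ]


def build(repo=".", main_file="main.tex"):
    """Run the commands in order and stop at the first failure."""
    for command in build_commands(main_file):
        subprocess.run(command, cwd=repo, check=True)
```

Anyone with a checkout can run `build()` to get the current state of the whole document, which replaces the manual integration step of the docx workflow.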

I am by no means an expert on this matter. I just find myself in a situation where using this approach would make my life a lot easier (in the roles mentioned above, I’m an editor and a contributor), and save time and money.

This is a personal opinion, based on my experience, what I’ve heard from others, and things I read on the Internet.