When building documentation for your product, you will often encounter situations where you need to mix and match a bunch of content that comes from different sources. It becomes a bit more complex when you need to start dealing with different platforms, e.g. documenting APIs for Python, .NET and Java all in one place.
We do all that and more on docs.microsoft.com, where we host documentation that is both hand-written and generated automatically from code. DocFX is a very powerful and versatile system that makes this possible through pre-defined contracts for structured and unstructured content.
While you can write Markdown for your articles and overviews relatively easily in your favorite editor, you need a set of disjoint tools to generate documentation for your code. In the context of DocFX, there is sphinx-docfx-yaml for Python, type2docfx for TypeScript and so on. Each of those tools has its own approach to handling inputs and outputs, which we normally delegate to individual continuous integration jobs. I wondered if there was a better way to do this for someone with no prior knowledge of the underlying DocFX requirements.
This is how the idea for adg was born. My goal with this project is to build a documentation generation CLI that would allow anyone to produce great code documentation with a one-liner. It’s written entirely in Python, and generating new docs can look like this:
In the example above, the command generates some Python documentation with Sphinx, and then uses the sphinx-docfx-yaml extension to convert the produced content to a DocFX-ready format (YAML) - but you would never know that from running the command itself. So what goes on behind the scenes? Because I started working on the Python component first, that seems to be a great candidate to analyze here.
It’s worth mentioning that not all the logic is in place yet - I still need to account for some runtime specifics (e.g. running on Windows might be ever so slightly different compared to running on a macOS machine), and I probably need to clean up the code. For now, the focus is on getting things done.
In adg.py, I am using the argparse library to get the inputs from the user. I don’t want to worry too much about configuration settings or any additional files - just give the tool everything it needs to know in the terminal:
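To illustrate the idea, here is a minimal sketch of an argparse-based entry point. The flag names (`--platform`, `--library`, `--out`) and the default output folder are assumptions made for this example, not necessarily the options adg exposes.

```python
import argparse

# Hypothetical list of platforms; only Python is discussed in this post.
SUPPORTED_PLATFORMS = ("python",)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that collects everything from the terminal."""
    parser = argparse.ArgumentParser(
        prog="adg",
        description="Generate DocFX-ready documentation for a library.",
    )
    # All flag names below are illustrative assumptions.
    parser.add_argument(
        "--platform",
        required=True,
        choices=SUPPORTED_PLATFORMS,
        help="Platform the library targets.",
    )
    parser.add_argument(
        "--library",
        required=True,
        help="Name of the library to document.",
    )
    parser.add_argument(
        "--out",
        default="docs-out",
        help="Folder that receives the generated files.",
    )
    return parser


# Parse an explicit list rather than sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--platform", "python", "--library", "requests"])
print(args.platform, args.library, args.out)
```

Because `--platform` uses `choices`, argparse rejects unsupported values before any other code runs, which keeps the first layer of validation out of the rest of the tool.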
The parameters are then validated and passed to the command processor, an abstraction class that works out things such as whether the right parameters were passed and which command needs to be executed. The command processor is pretty much just a proxy between user input and the “kernel” of the tool: coreutil.py, which is tasked with performing the heavy lifting. For example, it takes a library input and attempts to install it locally, to then push it through the documentation generation process - all within the
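As a rough sketch of the "install it locally" step, the snippet below installs a package into a throwaway folder by calling pip through the current interpreter. The function names and the use of pip's `--target` option are assumptions for illustration; the post does not show how coreutil.py performs the install.

```python
import subprocess
import sys
import tempfile
from typing import List


def build_install_command(library: str, target_dir: str) -> List[str]:
    """Return the pip invocation that installs `library` into `target_dir`."""
    # Running pip via `python -m pip` ensures the same interpreter is used.
    return [sys.executable, "-m", "pip", "install", "--target", target_dir, library]


def install_library(library: str) -> str:
    """Install `library` into a fresh temporary folder and return its path."""
    target_dir = tempfile.mkdtemp(prefix="adg-")  # hypothetical prefix
    # check=True raises CalledProcessError if pip exits with a non-zero code.
    subprocess.run(build_install_command(library, target_dir), check=True)
    return target_dir
```

Installing into a dedicated folder keeps the package being documented separate from whatever is already installed on the machine.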
Similarly, it then triggers the documentation generation process through an external shell script (because DocFX does not have a native Python API):
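One way to hand off to a script from Python is `subprocess.run`. The sketch below is an assumption about how such a call could look; the function name, the `interpreter` parameter and the script path are hypothetical.

```python
import subprocess
from typing import Sequence


def run_script(script_path: str, args: Sequence[str] = (), interpreter: str = "bash") -> str:
    """Run a script in a child process and return what it printed."""
    result = subprocess.run(
        [interpreter, script_path, *args],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Example call (the script name is a placeholder):
# output = run_script("generate_docs.sh", ["requests"])
```

Passing the command as a list avoids `shell=True`, so a library name is never interpreted by the shell.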
I am not yet sure whether running the shell script is the ideal approach here - it is effectively spawning another process, and I have yet to check out execv (thank you to Brett Cannon for pointing me in this direction). It gets the job done, though! What the script hides behind the scenes is a set of commands that:
- Bootstrap a temporary Sphinx documentation project.
- Generate native Sphinx documentation based on the installed Python package.
- Convert the Sphinx output to DocFX-compatible YAML.
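The three steps above can be sketched with the standard Sphinx command-line tools. This is an illustration under assumptions, not the contents of the adg script: the author name, folder layout and function names are made up, and `docfx_yaml.extension` is the extension module name as I understand it from the sphinx-docfx-yaml documentation.

```python
import os
from typing import List


def enable_docfx_extension(conf_path: str) -> None:
    """Append the DocFX YAML extension to an existing Sphinx conf.py."""
    with open(conf_path, "a") as conf:
        conf.write("\nextensions.append('docfx_yaml.extension')\n")


def build_sphinx_plan(package: str, package_path: str, project_dir: str) -> List[List[str]]:
    """Return the commands to run, in order, as argument lists."""
    return [
        # 1. Bootstrap a temporary Sphinx project without interactive prompts.
        ["sphinx-quickstart", "-q", "-p", package, "-a", "adg", "--ext-autodoc", project_dir],
        # 2. Generate .rst stubs for the installed package's modules.
        ["sphinx-apidoc", "-o", project_dir, package_path],
        # 3. Build the project; with the extension enabled in conf.py,
        #    the build also emits DocFX-compatible YAML.
        ["sphinx-build", "-b", "html", project_dir, os.path.join(project_dir, "_build")],
    ]
```

In this sketch, `enable_docfx_extension` would be called on the generated conf.py after the first command finishes and before the build runs.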
Normally, you would need to run all of this yourself, but now adg does it for you. A similar approach would apply to other platforms, like Java, .NET and TypeScript (among others), where adg is tasked with shielding the user from the complex set of tools, commands and modifications rather than replacing them.
You can check out the project roadmap on GitHub - I hope to have a Release Candidate 3 build by September 2019 (I thought having three of them before a V1 release was reasonable). That build will have the major components implemented, allowing you to generate automated documentation for your APIs in minutes and then plug it straight into DocFX.
Have any thoughts? Let me know on Twitter!