What it’s like open sourcing software written for the federal government
I recently released a project I built for the CDC as open source. It’s called CDC Maps, and I wrote a case study about it on my portfolio here. It’s one of the biggest projects I’ve ever built, and I thought releasing it would be valuable. Doing so presented some interesting challenges that seemed worth writing about.
A little history
There has been a big push in recent years for the federal government to open source more code. In August 2016, the Obama White House issued M-16-21, a memorandum directing federal agencies to develop procedures around open sourcing code. It also set up a pilot program asking agencies to release at least 20% of their new custom code as open source for three years.
I’m fuzzy on how all of this was enforced, but I do know from my research that it led big federal agencies to develop their own sites, tools and guidelines for open sourcing software. NASA created code.nasa.gov and the CDC created open.cdc.gov. There’s also code.gov which serves as kind of an aggregator for open source work released by the government. The fact that all this existed made it way easier to consider open sourcing my code. Without it, the amount of red tape probably would have made me shelve the idea entirely.
The CDC also has internal advocates for open sourcing, so after launching the project I was able to carve out some of my time at work specifically to prepare it for release. This was a huge help because it meant I didn’t have to do it on nights and weekends.
Of note, an interesting side effect of all this push for open source work is that when I was tasked with architecting a rebuild of the mapping tool, I was basically told that I could not use proprietary software at all. Certain centers inside the CDC do choose to use their budget to pay for software that has mapping capability, but I work on a team that builds software for all the different centers to use.
I had spent some time researching the current landscape and there were some decent options out there but they were all paid – either a big up front license fee or an API with usage metering. These weren’t ideal for the business purpose especially since the codebase for the existing version was already using the open source D3 library. I don’t want to get too far into the technical nitty gritty of it, but using vanilla D3 was going to be really challenging.
It’s an old library with a steep learning curve, and its API has changed a lot over the years. One of the big considerations in doing a recode was using more modern technology that would be easier for a lot of developers to pick up and maintain, and raw D3 just didn’t seem like a good fit for that. Luckily, I found an open source project called React Simple Maps that solved the major pain points by wrapping D3’s geo library in React components.
I was already planning on using D3 inside React, but the problem is that both of them want to control the DOM. I would have needed some kind of glue code like react-faux-dom, which still would have required D3’s function calls, didn’t solve the problem I described above and seemed overly complicated. I am very grateful for React Simple Maps. If someone hadn’t chosen to open source that work, it wouldn’t have been available for me to use, and I would have faced a lot of pain and extra development time. It’s in that spirit that I really wanted to publish my own work as well.
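To make the D3-and-React tension a bit more concrete, here is a minimal sketch of the general idea behind wrapper libraries: compute the map geometry as plain data and let React do the rendering. This is not React Simple Maps’ actual code, and it does not use D3. The helper name ringToPath and the simple equirectangular math are assumptions made purely for illustration.

```typescript
// Hypothetical helper: turns one ring of [longitude, latitude] pairs into an
// SVG path string using a plain equirectangular projection.
type LonLat = [number, number];

function ringToPath(ring: LonLat[], width: number, height: number): string {
  const commands = ring.map(([lon, lat], i) => {
    // Map longitude -180..180 onto x 0..width and latitude 90..-90 onto y 0..height.
    const x = ((lon + 180) / 360) * width;
    const y = ((90 - lat) / 180) * height;
    return `${i === 0 ? "M" : "L"}${x.toFixed(1)},${y.toFixed(1)}`;
  });
  // Close the ring so the shape can be filled; an empty ring yields no path.
  return commands.length > 0 ? commands.join("") + "Z" : "";
}

// A React component could then render <path d={...} /> without any library
// touching the DOM directly.
const square: LonLat[] = [[-10, 10], [10, 10], [10, -10], [-10, -10]];
const d = ringToPath(square, 360, 180);
```

Because the function only returns a string, React stays in charge of creating and updating the SVG elements, which is the conflict the glue-code approach tries to work around.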
The first step was stripping out all of the CDC specific stuff. This was the vast majority of the time I spent. The maps are loaded on cdc.gov through an iframe using an internal system we call the “widget loader” and it was designed and built to be a completely standalone React application. The way people edit the maps and publish them interfaces directly with the backend.
Additionally, there was a backwards compatibility system to read and convert old config files from the previous mapping system, and the CDC had a lot of specific requirements that weren’t a good fit for a general release. I kind of had to wear my product hat and think about a cohesive set of functionality that other people would want to take, use and build upon.
I converted all of it into an actual component that could cleanly accept props, and I modified the editor so that people can enable it with a simple boolean and copy the config JSON when they’re done. This also involved rewriting a lot of CSS to be more self contained and removing viewport relative units like
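As a rough sketch of what a props contract like the one described above could look like: every name here (MapConfig, MapProps, isEditor, resolveProps, exportConfig) is hypothetical and is not taken from the published CDC Maps API. The point is only to show a config object passed in as a prop, an editor toggled by one boolean, and the config serialized to JSON for copying.

```typescript
// All names below are hypothetical; they are not the published CDC Maps API.
interface MapConfig {
  title: string;
  geoType: "us" | "world";
  data: Array<{ region: string; value: number }>;
}

interface MapProps {
  config: MapConfig;
  // A single boolean switches the editor UI on; it defaults to off.
  isEditor?: boolean;
}

// Resolve defaults in one place so the component body can rely on them.
function resolveProps(props: MapProps): Required<MapProps> {
  return { config: props.config, isEditor: props.isEditor === true };
}

// What a "copy config" button in the editor could place on the clipboard.
function exportConfig(config: MapConfig): string {
  return JSON.stringify(config, null, 2);
}
```

Keeping the config as plain JSON-serializable data is what makes the copy step trivial: whatever the editor produces can be pasted straight back in as the config prop.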
The end result is that the codebases for the two projects have diverged. There really wasn’t a way to avoid that. I actually think it’s healthy: new features developed on either side can be moved to the other, but the two are no longer linked 1:1. If the CDC really wants something niche, I can build it without having to worry about releasing it to a wider audience.
Will anybody care?
The main benefit of open sourcing code is having other developers add features to it, and that can’t happen if nobody uses it or cares. You can’t just build something with super specific requirements for your day job, strip out private server info and passwords and throw it on GitHub. Unfortunately, I saw a fair number of repos that fell into this category.
This was the second big part of the process. For the project to have any hope of gaining traction, it had to be easy for other people to pick up and use.
I spent a fair amount of time on the README, making sure that people building React apps can quickly pull this off the shelf and use it. My main thought was “What would I want as a developer?” I would want to be able to install it quickly, get it working, play around with it and see if it’s actually a good fit for what I need. I think the project does great on that front.
Finally, the bureaucracy
This was the most intimidating part: getting it through the approval process. Luckily, the CDC has some people internally who are very passionate about open source software. These people have developed guidance and were very helpful when I asked about next steps. I also had to deal with the contracting company I work for, Northrop Grumman, and get it cleared through them.
The main concern in open sourcing code from their perspective is security. Does it have known vulnerabilities? Does it have private server information or passwords? I had to get the code scanned by an automated tool, make sure there were no npm vulnerabilities and have other developers review it. I also had to make sure the license was correct and that my README included the appropriate disclaimers and other required text.
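To give a feel for what “private server information or passwords” means in practice, here is a toy sketch of the kind of check an automated scanner performs. It is not the tool used for CDC Maps, and the two patterns and the names scanForSecrets and SUSPICIOUS_PATTERNS are assumptions for illustration; real scanners cover far more cases.

```typescript
// Hypothetical patterns for illustration only; a real scanner checks many more.
const SUSPICIOUS_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "hard-coded password", pattern: /password\s*[:=]\s*["'][^"']+["']/i },
  { label: "private IPv4 address", pattern: /\b(?:10\.\d{1,3}|192\.168)\.\d{1,3}\.\d{1,3}\b/ },
];

interface Finding {
  line: number; // 1-based line number in the scanned text
  label: string;
}

// Check every line of a source file against every pattern.
function scanForSecrets(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { label, pattern } of SUSPICIOUS_PATTERNS) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, label });
      }
    }
  });
  return findings;
}
```

A check like this only flags candidates. It still takes a human reviewer to decide whether a match is a real leak, which is part of why having other developers look at the code matters.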
Finally, I had to submit that I had done all of this to my team’s Information System Security Officer (ISSO) and get final clearance from him.
All of this sounds like a lot, and it was, but it could have been way worse. I think this would have been much harder to do 3-4 years ago, which is really encouraging.
So CDC Maps is published! You can view the repo here.
I plan to maintain it for as long as I work on it and I hope it’s valuable to other people. I already have ideas for a lot of things I want to add in the next release, like unit tests, automated linting and some additional refactoring that is scheduled for the main codebase. I think it’s going to be a great way for me to improve my skills while hopefully providing value to other people.