Bicho developer onboarding assessment

20 November 2013, by Sumana Harihareswara, now of Changeset Consulting

The following is an inventory Sumana Harihareswara made for the maintainers of Bicho, part of the MetricsGrimoire suite. I was contributing to Bicho while improving my programming skills at the Recurse Center. MetricsGrimoire's maintainers had asked what blockers were stopping users and contributors from being happy with the tool, so I responded on the public mailing list with this report on the state of the Bicho codebase and the project's friendliness to new developers.

I think you asked a good question. I'm a code contributor rather than an end user, but I've found several blockers to contribution (focusing on Bicho in particular). I thought it might be useful to others for me to gather them into a list. I've also filed them as GitHub issues, and sometimes gone into more detail there.

  • Of the seven maintained backends, only two have automated tests, and those are limited. Some components, such as the database integration and configuration management, don't have any tests. It's tougher to make big changes or write a new backend without regression testing. This is the biggest blocker for me. (https://github.com/MetricsGrimoire/Bicho/issues/107)
  • We should convert existing screen-scraping backends to use APIs and general-purpose libraries like Requests and xmlrpclib. (https://github.com/MetricsGrimoire/Bicho/issues/108) One backend (Launchpad) uses an API while most don't. The Launchpad module (very reasonably!) uses the launchpadlib library, which returns Python objects, rather than a more general tool such as the Requests package (http://docs.python-requests.org/). So if you want to write a new backend, or change an old one to get data from a web API, the Launchpad backend is hard to use as an example.
  • The backends seem to duplicate a lot of code from each other instead of inheriting from common classes, making it harder to write a new backend. (https://github.com/MetricsGrimoire/Bicho/issues/39)
  • Some backends inherit from the Backend class and some don't. This also hinders refactoring and understanding. (also in issue 39)
  • There's a mix of old- and new-style classes, which can also be a gotcha and hinder refactoring and understanding. (https://github.com/MetricsGrimoire/Bicho/issues/109)
  • Right now there is a "bicho" file and a "Bicho/" directory at the top level of the repository, which causes problems for people on case-insensitive filesystems who want to pair program with me. (https://github.com/MetricsGrimoire/Bicho/issues/12) Most of my peers use Macs, so this blocks them.
  • As Alvaro said in IRC, the ORM integration (with Storm) is "duplicating code, creating an unneeded layer with *DB objects." This doesn't use the ORM in the best way, since "the idea with object databases is that objects are directly stored in the db, not that you need to create another layer with the db objects. I have reached this conclusion working with new backends ... and having to duplicate a lot of code for persistence." So I think we need to check that we're using Storm the way it's meant to be used. (https://github.com/MetricsGrimoire/Bicho/issues/110)
  • Storm is less popular than SQLAlchemy, so it's harder to find help when working with it. (https://github.com/MetricsGrimoire/Bicho/issues/111)
  • There are conflicting database schema docs. (https://github.com/MetricsGrimoire/Bicho/issues/112)
  • Bicho only seems to work well with MySQL, so it's harder to do quick setup/teardown for tests; it would be nice if it worked better with SQLite. (I could be wrong on this; does it actually work just fine with SQLite?)
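
To make two of the points above concrete (shared code in a common base class, and quick database setup/teardown for tests), here is a minimal sketch. It is not Bicho's actual code: the Backend class, the FakeTrackerBackend subclass, the issues table, and the stored_summaries helper are all hypothetical names invented for illustration, and it uses Python's built-in sqlite3 module directly rather than Storm.

```python
import sqlite3


class Backend(object):
    """Hypothetical common base class (an assumption, not Bicho's real Backend)."""

    name = "base"

    def __init__(self, conn):
        self.conn = conn
        # Hypothetical, simplified schema: one row per issue.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS issues "
            "(tracker TEXT, issue_id TEXT, summary TEXT)"
        )

    def fetch_issues(self):
        """Subclasses return an iterable of (issue_id, summary) pairs."""
        raise NotImplementedError

    def run(self):
        """Persistence logic written once and inherited by every backend."""
        rows = [(self.name, iid, summary) for iid, summary in self.fetch_issues()]
        self.conn.executemany("INSERT INTO issues VALUES (?, ?, ?)", rows)
        self.conn.commit()
        return len(rows)


class FakeTrackerBackend(Backend):
    """Stand-in backend that returns canned data instead of calling a tracker."""

    name = "fake"

    def fetch_issues(self):
        return [("1", "Crash on startup"), ("2", "Typo in docs")]


def stored_summaries(conn):
    """Return all stored issue summaries, ordered by issue id."""
    cur = conn.execute("SELECT summary FROM issues ORDER BY issue_id")
    return [row[0] for row in cur.fetchall()]
```

In a test, sqlite3.connect(":memory:") gives a throwaway database that disappears when the connection is closed, so each test can build a fresh FakeTrackerBackend, call run(), and check stored_summaries() without a MySQL server. A new backend in this sketch only has to supply fetch_issues(); whether Bicho's real backends could be split this cleanly is an open question.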

Since I started contributing to Bicho six weeks ago, I've also run into underdocumented code or obsolete docs/comments, or stylistic inconsistencies in the code, but mostly I've been able to fix those myself, especially with the kind and abundant help from maintainers via the mailing list or IRC.

But I'm just not a proficient enough programmer yet to work productively with this codebase, given the blockers I've mentioned. The points I listed may not seem serious to more experienced programmers, but I'm still learning and they're blockers for me. :( At this point I am putting aside my previous goal of writing a Trac backend (my notes are in https://github.com/MetricsGrimoire/Bicho/issues/113 in case anyone wants to pick that project up).

I hope this doesn't come across as blame or negativity. I do hope to contribute to Bicho again in the future when I'm a better programmer and when Bicho has fixed a few of the blockers I mentioned.

Sumana Harihareswara

The project's maintainers thanked me. Bicho has since been superseded by Perceval.

Changeset can deliver this kind of developer onboarding audit for your project, to help you prepare for interns or other new contributors. Get in touch for a free initial 30-minute chat.