Open Source ROI Model

Revision as of 18:19, 31 May 2007 by Tim Bird (Talk | contribs)


Here are some loose thoughts on Open Source contribution Return On Investment (ROI):

I (Tim Bird) once created a game called TuxBucks which tried to model the dynamics of open source ROI.

Below is a list of some factors affecting open source ROI, which I think would be good to model. For most of these, I have no idea how to come up with realistic values.


It is often stated by open source adherents that there are justifiable business reasons for contributing to open source, independent of the obligations of the license or a philosophical desire to return value for value received.

Some of these reasons are:

  • reduced maintenance cost over time
  • code improvements for contributed code

This page describes some factors affecting the return on investment that an individual or company might expect from open source. It tries to define factors which affect the long-term costs and benefits of utilizing open source software, and contributing to its development. It is intended to include consideration of direct costs and received value, as well as the role of network effects in increasing the value of shared code bases over time.

Right now, this is just a loose collection of ideas for things that affect ROI. Maybe someday, I'll turn this into a more formal paper, and try to collect metrics for some of the factors mentioned below. --TimBird 21:16, 31 May 2007 (EEST)

Big overview

  • Total return from open source = value of code that is relevant to your product, minus acquisition and integration cost, minus participation cost
    • Let RP = value of code relevant to your product
    • Let AC = acquisition cost
    • Let IC = integration cost
    • Let PC = participation cost
  • the formula becomes ROI = RP - AC - IC - PC
  • Acquisition cost = license fees plus subscription fees plus downloading costs
    • downloading costs = cost to train inexperienced engineers in downloading plus cost to download
  • Integration cost = finding cost plus conflict resolution cost plus adaptation cost plus quality assurance cost
    • Integration cost also includes integration with 3rd-party software (not modeled yet, above just reflects integration of non-mainlined patches)
    • Finding cost = cost to train inexperienced engineers in finding plus search time plus downloading cost plus evaluation cost
    • conflict resolution cost = cost to resolve patch conflicts
      • Conflict resolution depends on patch series breadth (patch locality), and magnitude of version gap (among other things)
    • Adaptation cost = engineering cost to adapt existing code to current situation
      • Adaptation may involve re-writing existing code, or developing new code (drivers) where they don't exist
      • Adaptation is the programming companies have to do to fill in the gaps left by open source code
      • Adaptation costs increase as hardware is customized, and decrease as commodity hardware is used
      • Adaptation costs likely decrease when older hardware is used (due to software "showing up" for hardware over time)
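
The breakdown above can be written down as a small Python sketch. It only adds up the terms named in the list. The class and field names are hypothetical, the numbers in the example are made up purely to show the arithmetic, and everything is assumed to be in one common unit (say, engineer-days). The list counts downloading cost under both acquisition cost and finding cost; the sketch counts it once, under acquisition.

```python
from dataclasses import dataclass


@dataclass
class AcquisitionCost:
    """AC = license fees + subscription fees + downloading costs."""
    license_fees: float = 0.0
    subscription_fees: float = 0.0
    download_training: float = 0.0  # training inexperienced engineers to download
    download_effort: float = 0.0    # cost of the download itself

    def total(self) -> float:
        return (self.license_fees + self.subscription_fees
                + self.download_training + self.download_effort)


@dataclass
class IntegrationCost:
    """IC = finding + conflict resolution + adaptation + quality assurance."""
    finding: float = 0.0
    conflict_resolution: float = 0.0
    adaptation: float = 0.0
    quality_assurance: float = 0.0

    def total(self) -> float:
        return (self.finding + self.conflict_resolution
                + self.adaptation + self.quality_assurance)


def total_return(rp: float, ac: AcquisitionCost, ic: IntegrationCost,
                 pc: float) -> float:
    """ROI = RP - AC - IC - PC, as defined in the overview list."""
    return rp - ac.total() - ic.total() - pc


# Made-up illustrative values, not measurements.
example_ac = AcquisitionCost(subscription_fees=5.0, download_training=2.0,
                             download_effort=1.0)
example_ic = IntegrationCost(finding=10.0, conflict_resolution=15.0,
                             adaptation=30.0, quality_assurance=12.0)
example_return = total_return(rp=100.0, ac=example_ac, ic=example_ic, pc=10.0)
print(example_return)  # 15.0
```

Note that the quantity computed here is a net return (value minus costs) rather than a ratio, which matches the formula as written in the list.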

Relevant code dynamics

  • the amount of relevant code depends on the total size of the available code, and the uniqueness of your hardware
    • total size of available code depends on the contribution amount and contribution value (and contribution rate, over time)
    • contribution amount depends on size of contributor pool
    • percentage of relevant code depends on number of contributors in your market area (this is what CELF tries to affect)
  • Can code areas be isolated and analyzed independently?
    • e.g. Can you look at just processor-specific or board-specific contributions, and measure these separately from the main code base?
  • size of relevant code depends on network size (and network quality)?
    • maybe network size should not be measured in people, but by contribution rate?

Effect of mainlining code

  • what costs go up in order to mainline code?
    • cost to generalize code (very hard to gauge?)
    • cost to convert code to submission standards
    • cost to forward-port code to the latest project version
      • may be unavoidable if your code is dependent on other items (especially board support or architecture code)
      • could factor in cost to forward-port the board support or architecture code
  • what costs go down as a result of mainlined code?

Effect of not being current with your source code version

  • What is the effect of not being current?
    • raises mainlining cost (see mainlining code costs)

Effect of experience with open source

  • What are switching costs for developers entering open source work?
    • How expensive is it to learn diff+patches or git, for example?
    • What is the magnitude of the learning curve for participating?
    • Does the learning curve increase over time? (I believe it does, as the complexity of code bases grows)
  • Can this be measured by examining

Dynamics of network effects

  • What number of contributors is needed before open source becomes useful?
    • number of contributors affects amount of contributions
    • skill of contributors affects value of contributions
    • amount and value of contributions affects the total code base size
    • total code base value affects likelihood of code relevance to product (really?)
  • Can this be modeled, realistically?
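
Whether a realistic model exists is an open question, but the chain of dependencies in the list above can at least be written down as a toy Python sketch. Every functional form here is an assumption chosen for simplicity: a linear contribution amount, a value that scales linearly with average skill, and a saturating curve for relevance. None of the parameter names or values come from measurements.

```python
def contribution_amount(contributors: int, rate_per_contributor: float) -> float:
    """Number of contributors affects amount of contributions (assumed linear)."""
    return contributors * rate_per_contributor


def code_base_value(amount: float, average_skill: float) -> float:
    """Amount and value of contributions affect the code base.

    Assumption: skill acts as a simple multiplier on the amount contributed.
    """
    return amount * average_skill


def relevance_likelihood(value: float, half_value: float) -> float:
    """Total code base value affects likelihood of relevance to a product.

    Assumption: a saturating curve that is 0 for an empty code base,
    0.5 when value == half_value, and approaches 1 as value grows.
    """
    if value <= 0:
        return 0.0
    return value / (value + half_value)


# Made-up illustrative inputs.
amount = contribution_amount(contributors=50, rate_per_contributor=2.0)
value = code_base_value(amount, average_skill=1.0)
likelihood = relevance_likelihood(value, half_value=100.0)
print(amount, value, likelihood)  # 100.0 100.0 0.5
```

The saturating curve is only one way to express the idea that extra code helps less once most of what a product needs is already available. The "(really?)" in the list applies equally to this choice.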

Dynamics of participation costs

  • as participation increases, training costs go down
    • finding cost is reduced
    • mainlining cost is reduced
    • integration cost is reduced??


  • What about "quantum effects"?
    • a single developer or single company may significantly affect the outcome