Open Source ROI Model

Here are some loose thoughts on Open Source contribution Return On Investment (ROI):

I (Tim Bird) once created a game called TuxBucks which tried to model the dynamics of open source ROI.

Below is a list of some factors affecting open source ROI, which I think would be good to model. For most of them, I have no idea how to come up with realistic values.

Overview
It is often stated by open source adherents that there are justifiable business reasons for contributing to open source, independent of the obligations of the license or a philosophical sense of duty to return value for value received.

Some of these reasons are:
 * reduced maintenance cost over time
 * code improvements for contributed code

This page describes some factors affecting the return on investment that an individual or company might expect from open source. It tries to define factors which affect the long-term costs and benefits of utilizing open source software, and contributing to its development. It is intended that this include consideration of direct costs and received value, as well as the role of network effects in increasing the value of shared code bases over time.

Right now, this is just a loose collection of ideas for things that affect ROI. Maybe someday, I'll turn this into a more formal paper, and try to collect metrics for some of the factors mentioned below.

What I'd really like to figure out is whether there is some quantifiable "critical mass" of participation needed in order to keep open source alive and relevant, that could be communicated to companies as a participation recommendation. --TimBird 21:16, 31 May 2007 (EEST)

Big overview

 * Total return from open source = value of code that is relevant to your product, minus acquisition and integration cost, minus participation cost


 * Let PRCV = value of code relevant to your product (Product-Relevant Code Value)
 * Let AC = acquisition cost
 * Let IC = integration cost
 * Let PC = participation cost
 * the formula becomes ROI = PRCV - AC - IC - PC


 * Acquisition cost = license fees plus subscription fees plus downloading costs
 * downloading costs = cost to train inexperienced engineers in downloading plus cost to download
 * Integration cost = finding cost plus conflict resolution cost plus adaptation cost plus quality assurance cost
 * Integration cost also includes integration with 3rd-party software (not modeled yet, above just reflects integration of non-mainlined patches)
 * Finding cost = cost to train inexperienced engineers in finding plus search time plus downloading cost plus evaluation cost
 * conflict resolution cost = cost to resolve patch conflicts
 * Conflict resolution depends on patch series breadth (patch locality), and magnitude of version gap (among other things)
 * Adaptation cost = engineering cost to adapt existing code to current situation
 * Adaptation may involve re-writing existing code, or developing new code (drivers) where they don't exist
 * Adaptation is the programming companies have to do to fill in the gaps left by open source code
 * Adaptation costs increase as hardware is customized, and decrease as commodity hardware is used
 * Adaptation costs likely decrease when older hardware is used (due to software "showing up" for hardware over time)
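
The breakdown above can be sketched as a small Python calculation. This is only an illustration of how the terms combine: the function names are hypothetical, and every number in the example is a made-up placeholder, not a measured value. As in the list above, downloading cost feeds into both acquisition cost and finding cost.

```python
# Sketch of the cost breakdown; all names and numbers are illustrative assumptions.

def downloading_cost(training, download):
    # cost to train inexperienced engineers in downloading + cost to download
    return training + download


def acquisition_cost(license_fees, subscription_fees, downloading):
    # AC = license fees + subscription fees + downloading costs
    return license_fees + subscription_fees + downloading


def finding_cost(training, search_time, downloading, evaluation):
    # training in finding + search time + downloading cost + evaluation cost
    return training + search_time + downloading + evaluation


def integration_cost(finding, conflict_resolution, adaptation, quality_assurance):
    # IC = finding + conflict resolution + adaptation + quality assurance
    # (integration with 3rd-party software is not modeled here either)
    return finding + conflict_resolution + adaptation + quality_assurance


def total_return(prcv, ac, ic, pc):
    # ROI = PRCV - AC - IC - PC
    return prcv - ac - ic - pc


# Placeholder values in arbitrary cost units (for example, engineer-days).
dl = downloading_cost(training=1, download=1)
ac = acquisition_cost(license_fees=0, subscription_fees=5, downloading=dl)
fc = finding_cost(training=2, search_time=3, downloading=dl, evaluation=4)
ic = integration_cost(finding=fc, conflict_resolution=6, adaptation=20,
                      quality_assurance=10)
roi = total_return(prcv=100, ac=ac, ic=ic, pc=15)
print(roi)  # prints 31
```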

Issues regarding Relevant code

 * the value of relevant code depends on the total size of the available code, and the uniqueness of your hardware
 * value is roughly equal to the amount of time it would take to write the code yourself or license it from someone
 * hence, value of code may depend on market conditions (e.g. does anyone else offer code to do the same thing?)
 * there could be value diminishers (such as code you wouldn't have written yourself, which detracts from use in your product, but which you don't have time to remove)
 * total size of available code depends on the contribution amount and contribution value (and contribution rate, over time)
 * contribution amount depends on size of contributor pool
 * percentage of relevant code depends on number of contributors in your market area (this is what CELF tries to affect)
 * Can code areas be isolated and analyzed independently?
 * e.g. Can you look at just processor-specific or board-specific contributions, and measure these separately from the main code base?
 * size of relevant code depends on network size (and network quality)?
 * maybe network size should not be measured in people, but by contribution rate?
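
One way to read the "write it yourself or license it" rule above is as a replacement cost. The sketch below is an assumption, not something the notes settle: it takes the cheaper of the two options as the value and subtracts any value diminishers. The function name and parameters are hypothetical.

```python
# Sketch only: treats PRCV as replacement cost minus diminishers (an assumption).

def product_relevant_code_value(write_cost, license_cost=None, diminishers=0.0):
    # If nobody else offers equivalent code, the only alternative is writing it.
    if license_cost is None:
        replacement = write_cost
    else:
        # Assumption: a company would pick the cheaper alternative.
        replacement = min(write_cost, license_cost)
    # Diminishers: cost of unwanted code you don't have time to remove.
    return replacement - diminishers
```

Under this reading, the value drops when a competing offering appears with a lower license cost, which matches the point about market conditions.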

Effect of mainlining code

 * what costs go up in order to mainline code?
 * cost to generalize code (very hard to gauge?)
 * cost to convert code to submission standards
 * cost to forward-port code to the latest project version
 * may be unavoidable if your code is dependent on other items (especially board support or architecture code)
 * could factor in cost to forward-port the board support or architecture code
 * what costs go down as a result of mainlined code?

Effect of not being current with your source code version

 * What is the effect of not being current?
 * raises mainlining cost (see mainlining code costs)

Effect of experience with open source

 * What are switching costs for developers entering open source work?
 * How expensive is it to learn diff+patches or git, for example?
 * What is the magnitude of the learning curve for participating?
 * Does the learning curve increase over time? (I believe it does, as the complexity of code bases grows)
 * Can this be measured by examining contribution rate from individuals over time (if they stick with it)?
 * probably not, since so many factors affect contribution rate (enthusiasm, development speed, etc.)

Dynamics of network effects

 * What number of contributors is needed before open source becomes useful?
 * number of contributors affects amount of contributions
 * skill of contributors affects value of contributions
 * amount and value of contributions affect the total code base size
 * total code base value affects likelihood of code relevance to product (really?)
 * Can this be modeled, realistically?

Dynamics of participation costs

 * as participation increases, training costs go down
 * finding cost is reduced
 * mainlining cost is reduced
 * integration cost is reduced??
 * as participation changes, does it affect other contributors?
 * is there an "ignition" point or "stall" point for contributors?
 * That is, if the contribution rate is below a certain point, do contributors wander off and abandon the project?
 * Are there generic parameters that affect this point, or is it project-specific (or market-specific)?
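
To make the "ignition" versus "stall" question concrete, here is a toy simulation. It is not derived from any project data: it borrows a threshold-growth model (the strong Allee effect from population ecology) purely as an assumption, and the threshold, capacity, and rate values are invented. Below the threshold the contributor pool shrinks toward zero, and above it the pool grows toward the capacity.

```python
# Toy model (assumed, not measured): contributor pool with a stall threshold.

def simulate_pool(initial, threshold, capacity, rate=0.1, steps=200):
    """Return the pool size at each step, starting from `initial`.

    threshold: hypothetical stall point, below which contributors wander off
    capacity:  hypothetical maximum pool size the market can support
    """
    n = float(initial)
    history = [n]
    for _ in range(steps):
        # Negative below the threshold, positive between threshold and capacity.
        growth = rate * n * (n / threshold - 1.0) * (1.0 - n / capacity)
        n = max(0.0, n + growth)
        history.append(n)
    return history


stalled = simulate_pool(initial=8, threshold=10, capacity=100)
ignited = simulate_pool(initial=12, threshold=10, capacity=100)
print(round(stalled[-1], 2), round(ignited[-1], 2))
```

Whether real projects behave like this, and whether the threshold is generic or project-specific, is exactly the open question in the list above.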

Miscellaneous

 * What about "quantum effects"?
 * a single developer or single company may significantly affect the outcome
 * How do you predict the future when it is completely dependent on individual contributions?
 * Not trying to predict an individual outcome.
 * e.g. Can't predict how long it will take to invent the light bulb
 * Can't predict when a particular driver will be written (or a particular problem solved)
 * Is it possible to predict the code size of the Linux kernel for 2008, 2009, etc.?
 * Is it possible to predict the number of supported ARM platforms in the future?
 * As granularity increases, quantum effects dominate
 * So, what CAN be measured?
 * Trying to get an estimate for the amount of code that will not have to be written or bought in the future
 * Trying to determine factors which affect that, so it can be increased