Friday, January 26, 2007

Enterprise Open Source Policy

One of my colleagues asked me what I had contributed to open source lately. I actually haven't contributed any source code to an OSS project for several years, but last year I managed to make a rather rare contribution. I led a handful of co-workers, and we drafted our company's official Open Source Software Use Policy. After a technical, security and legal review, it has become the set of guidelines we use in practice almost every day. It's probably the most popular standard any EA group I know of has ever written. What follows is a brief synopsis of the policy.

First, we defined several categories of use ranging from just downloading the binary and using the tool all the way to modifying the OSS code and using it on the customer-facing site. Our attorneys were very helpful in this regard as we had to do a risk-benefit analysis from several different viewpoints.

Second, we used the Open Source Initiative's list of licenses and evaluated all of them. We then created a matrix that cross-references each license with each category of use. Approval of a license for a category does not, by itself, mean that someone can actually use the software. That's a separate issue.
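To make the idea of the matrix concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the category names only echo the two extremes mentioned above, the license identifiers are just familiar examples, and the approvals are made up. It is not our actual matrix and it is certainly not legal advice.

```python
from enum import Enum


class Use(Enum):
    # Hypothetical categories; only the two extremes are named in the post.
    BINARY_TOOL = "download the binary and use the tool"
    MODIFIED_CUSTOMER_FACING = "modify the code and run it on the customer-facing site"


# Illustrative entries only: (license, category of use) -> approved?
# These values are invented for the example, not taken from the real policy.
MATRIX = {
    ("MIT", Use.BINARY_TOOL): True,
    ("MIT", Use.MODIFIED_CUSTOMER_FACING): True,
    ("GPL-2.0", Use.BINARY_TOOL): True,
    ("GPL-2.0", Use.MODIFIED_CUSTOMER_FACING): False,
}


def license_approved(license_id: str, use: Use) -> bool:
    """Return True only if the (license, use) pair is explicitly approved.

    Anything missing from the matrix is treated as not approved. A True
    result covers the license only; whether the software itself is
    appropriate to use is a separate decision.
    """
    return MATRIX.get((license_id, use), False)
```

Defaulting unknown pairs to "not approved" is a design choice of this sketch: a license nobody has evaluated yet should fail closed rather than slip through.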

Third, we outlined a lightweight evaluation process: if the license is approved, the developer and their manager determine whether the software is actually appropriate to use, consult with architecture (something similar might already be pre-approved) and prototype a case. The development manager and the architect can both approve the use of the OSS code in production.

Fourth, we defined a simple process for making contributions back to the OSS community. Simply, development, architecture and legal would review the code and if approved, the company's governing board would be asked to allow the submission back to the project. We essentially piggy-backed on the same process already in place for identifying possible patents and trade secrets, a process most public companies already have.

Fifth, we created the guidelines for a company-sponsored OSS project. This included identifying one senior developer as an ambassador for the project.

We also set policy about extracurricular coding, and wove that into the employment agreement.

Finally, we also described a policy under which the procurement of OSS is no different from the procurement of any other software, and it gets managed like any other company asset.

The resulting benefits of this policy have been very positive and measurable. We have cataloged all OSS in use, all of it is properly maintained and version-controlled, and plenty of patches, bug fixes and documentation have been submitted to a number of OSS projects. Everybody wins.

I have even personally used the policy as a recruiting tool when asked what the company policy is regarding OSS. It has indeed sealed the deal for several people we have hired.

Thursday, January 25, 2007

Horizontal Scalability - See I'm Not Crazy

When the gents at eBay gave their presentation at SD Forum 2006, it was great timing for me. First, it provided independent working proof for several concepts we were prototyping, thereby verifying we were on the right track and not undertaking some "wild EA research project". Yes, that's a quote. It's also rather gratifying to know that eBay has in production some of the same technology we had selected for our stack, specifically for operations management. Second, I had published material to counter the cries of the Luddites and other technologists who need to be dragged into the 1990s. Third, it pretty much confirmed a lot of ideas I had about how to horizontally scale applications. I won't go into the differences between distributing an application vs. a distributed application because that's material for another blog post. However, distributing the data is just as interesting.

Just looking at the data from our critical search paths based on HTTP traffic, it was obvious we needed to horizontally partition our data. The eBay paper provided the data I needed to finally tip the scales and get this project underway. We had long since realized we had to vertically partition data because disk-based RDBMSes have been too slow from day one. This fact has been verified time and time again, since we have at least 3 different kinds of caches on top of read-only banks of DB servers. This raises the question: why use a cache at all? Are we using it to avoid hitting the database, or are we really using it to get the persisted data as close to the application as possible? The difference between the two viewpoints is critical, as it turns out. We prototyped a way to replace the read-only farms of RDBMSes with something a lot less expensive. We can trade 10 licenses for 2 and re-allocate 60% of the hardware. Over the next couple of weeks, we will prove the concept. This could be very interesting, especially if we can save a couple million on licensing costs next year.
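For anyone unfamiliar with horizontal partitioning, here is a minimal sketch of one common way to do it: route each row to a shard by hashing its key. This is not a description of our prototype or of eBay's design; the shard names, the key format and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib

# Hypothetical shard names; in practice these would map to real data stores.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard for a key using a stable hash.

    A cryptographic digest is used instead of Python's built-in hash()
    because hash() is randomized per process for strings, and the mapping
    must be identical across processes and restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]
```

Calling `shard_for("user:42")` always returns the same shard for the same key, which is the property the application depends on. The trade-off of plain modulo hashing is that changing the number of shards remaps most keys, so a real deployment would likely want a directory table or consistent hashing instead.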

Tuesday, January 23, 2007

Transparency Specified

I was chatting today with a colleague who is working on a blogging policy for our company. Strangely enough, it's about internal company blogging, which was pretty anti-climactic. There was no mention of either employees blogging to the public or employees, on their own time, blogging to the public.

In contrast, I came across an ad today for a Chief Architect, and one of the requirements was "Contributions in external forums (conferences, publications, blogs, open source)".

Monday, January 08, 2007

CYA Trumps Productivity ... Again

I'm beginning to wonder if I'm actually an enterprise architect or a portfolio manager. Of course, I can make a very strong argument for portfolio management being a sub-function of enterprise architecture, but that's another post altogether.

I'm leading a small, very senior team through the process of selecting various technologies to become enterprise standards in the general realm of "temporary and persistent data storage". We actually can't say caches and databases because that would unnecessarily spook the herd. Anyhow, the problem begins with the fact that this is a very senior team and each of us has a diverse work history and a correspondingly large t-shirt collection. This enables us to leverage our experience and move through the evaluation process much more quickly. The fact that we have very crisp requirements helps a lot, too. The second aspect of the problem is that we are using Scrum to manage the project within some fairly short sprints. The work involves research, disposable prototypes and POCs.

Last week, we ran into some problems with regard to progress and status reporting. In the interest of brevity, we decided to condense each finding, outcome and recommendation into a line item in a bullet list. After several iterations of charts, graphs, tables and nested lists, we still had too much information. The granularity of information shrank from "upgrade database cluster X hardware and add a hardware load balancer" to "upgrade database hardware". After several more iterations, we finally managed to extract the real objections. These are quotes from the review meeting:

"This is too much information for a 2-week reporting period."

"These aren't interim results?"

"You guys are trying to solve world hunger!"

After a bit of questioning, it turns out there are several other teams working on different projects using similar approaches. The main point of contention gradually emerged: this team was completing almost twice as many story points as other, larger teams. That meant the other teams' progress was much easier to track and understand because there were fewer work products to review.

We were actually told to "increase your transparency" and "adjust your delivery schedule to accommodate the other teams." We were also asked to create a "responsibility traceability matrix"(TM) so we could track the handoff of our findings/recommendations as they were incorporated into projects and finished. Now, there are all kinds of things that can be read between the lines, and I agree with most of them. There you have it, folks: Traceability and accountability are more important than productivity, no matter the size of the project or the demands of the business.