Friday, July 28, 2017

Software Engineering Maxim #16: Spreading knowledge makes it thicker

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

Pigeonholing is rampant in software engineering, engineers who have become experts in a particular area and always end up being assigned tasks in that area.

There are occasions where that is optimal, where becoming a subject matter expert takes substantial time and effort, but these situations are rare. In most cases it is not the expense of becoming an expert that keeps an engineer doing similar work over and over, it is just complacency.

Areas of the product where the team needs to continue to expend effort over a long time period should move around to different members of the team. Multiple people familiar with an area will reinforce each other. Additionally, teaching the next person is a very effective way to get a better understanding for oneself.


Wednesday, July 26, 2017

Software Engineering Maxim #15: Sometimes the hard way is the right way

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

For any given task, there is often some person somewhere who has done it before and knows exactly what to do and how to do it. For things which are likely to be done once in a project and never repeated, relying on that person (either to do it or to provide exactly what to do step by step) can significantly speed things up.

For things which are likely to be repeated, or refined, or iterated upon, it can be better to not rely on that one expert. Learning by doing results in a much deeper understanding than just following directions. For an area which is core to the product or will be extended upon repeatedly, the deeper understanding is valuable, and is worth acquiring even if it takes longer.


Monday, July 24, 2017

Software Engineering Maxim #14: Churn wastes more than time

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

Project plans change. It happens.

When plans change too often, or when a crucial plan is suddenly cancelled and destaffed, we burn more than just the time which was spent on the old plan. We burn confidence in the next plan. People don’t commit as readily and don’t put their best effort into it until they’re pretty sure the new plan will stick.

In the worst case, this becomes self-reinforcing. New plans fail because of the lack of effort engendered by the failure of previous plans.


Friday, July 21, 2017

Software Engineering Maxim #13: Cadence trumps mechanism

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

We tend to focus a lot on mechanisms in software engineering as a way to increase velocity or productivity. We reduce the friction of releasing software, or we automate something, and we expect that this will result in more of the activity which we want to optimize.

Inertia is a powerful thing. A product at rest will tend to stay at rest, a product in motion will tend to stay in motion. The best way to release a bunch of software is to release a bunch of software, by setting a cadence and sticking to it. People get used to a cadence and it becomes self-reinforcing. Making something easier may or may not result in better velocity, making it more regular almost always does.


Wednesday, July 19, 2017

Software Engineering Maxim #12: No postmortem prior to mortem

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

There are going to be emergencies. It happens, despite our best efforts to anticipate risks. When it happens, we go into damage control mode to resolve it.

People not involved in handling the emergency will begin to ask about a postmortem almost immediately, even before the problem is resolved. It is important to not begin writing the postmortem until the problem has been mitigated. Doing so turns a unified crisis response into a hotbed of fingerpointing and intrigue. Even in a culture of blameless postmortems, it is difficult to avoid the harmful effects of the hints of blame while writing that blameless postmortem.

It is fine, even crucial, to save information for later drafting of the postmortem. IRC logs, lists of bugs/CLs/etc, will all be needed eventually. Just don’t start a draft of a postmortem while still antemortem.


Monday, July 17, 2017

Software Engineering Maxim #11: Don’t shoot the monitoring

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

There is a peculiar dynamic when systems contain a mix of modules with very good monitoring along with modules with very poor monitoring; the modules with good monitoring report all of the errors.

The peculiarity becomes damaging if the result is to have all of the bugs filed against the components with good monitoring. It makes it look like those modules are full of bugs, when the reality is likely the opposite.


Friday, July 14, 2017

Software Engineering Maxim #10: Risk is multiplicative

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

There is a school of thought that when there are multiple large projects going on, and there is some relation between them, that they should be tied together and made dependent upon each other. The arguments for doing so are often:

  • "We’re going to pay careful attention to those projects, making them one project means we’ll be able to track them more effectively."
  • "There was going to be duplication of effort, we can share implementation of the common systems."
  • "We can better manage the team if we have more people available to be redirected to the pieces which need more help."

The trouble with this is that it glosses over the fundamental decision being made: nothing can ship until all of it ships. Combining risks makes a single, bigger risk out of the multiple smaller risks.


Wednesday, July 12, 2017

Software Engineering Maxim #9: Evolve systems as a series of incremental changes

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

There is substantial value in code which has seen action in the field. It contains a series of small and large decisions, fixes, and responses which made the system better over time. Generally these decisions are not recorded as a list of lessons learned to be applied to a rewrite or to the next system.

Whenever possible, systems should evolve as a series of incremental changes to take it from where it is to where we want it to be. Doing this incrementally has several advantages:

  • benefits are delivered to customers much earlier, as the earliest pieces to be completed don’t have to wait for the later pieces before deployment.
  • there is no stagnant period in the field after work on the old system is stopped but before the new system is ready.
  • once the system is close enough to where we want it to be that other stuff moves higher on the list of priorities, we can stop. We don’t have to push on to finish rewriting all of it.

Monday, July 10, 2017

Software Engineering Maxim #8: Money is not the only motivator

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

A monetary bonus is one tool available for managers to reward good work. It is not the only tool, and is not necessarily the best tool for all situations.

For example, to encourage SWEs to write automated system tests we created the Yakthulhu of Testing. It is a tiny Cthulhu made from the hair of shaved yaks (*). A Yakthulhu can be obtained by checking in one’s first automated test to run against the product.

(*) It really is made from yak hair. Yak hair yarn is a thing which one can buy. Disappointingly though, they do not shave the yaks. They comb the yaks.


Friday, July 7, 2017

Software Engineering Maxim #7: The service is the product

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

The product is not the code. The product is not the specific feature we launched last week, nor the big thing we’re working on launching next week.

The product is the Service. The Product which customers care about is that they can get on the Internet and do what they need to do, that they can turn on the TV and have it work, that they can make phone calls, whatever it is they set out to do.


Wednesday, July 5, 2017

Software Engineering Maxim #6: It’s a marathon, not a sprint

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

"Launch and iterate" is a catchy phrase, but often turns into excuses to launch something sub-par and then never getting around to improving it.

Yet there is a real advantage to being in a market for the long term, launching a product and improving it. Customer happiness is earned over time, not all at once with a big launch.

  • This means structuring teams for sustained effort, not big product pushes.
  • It means triaging bugs: not everything will get fixed right away, but should at least be looked at to assess relative priority.
  • It means really thinking about how to support the product in the field.
  • It means not running projects in a way which burn people out.

Monday, July 3, 2017

Software Engineering Maxim #5: There is an ideal rate of breakage: no more, no less

(This is one of a series of Software Engineering Maxims Which May or May Not Be True, developed over the last few years of working at Google. Your mileage may vary. Use only as directed. Past performance is not a predictor of future results. Etc.)

Breaking things too often is a sign of trying to do too much too quickly, and either excessively dividing attention or not allowing time for proper care to be taken.

Not breaking things often enough is a sign of the opposite problem: not pushing hard enough.

I’m sure it is theoretically possible for a team to move at an absolutely optimal speed such that they maximize their results without ever breaking anything, but I’ve no idea how to achieve it. The next best thing is to strive for just the right amount of breakage: not too much, not too little.