The Standard Engineer

I came across a framework on a podcast in 2019¹. I was blown away by how well it translated into the domain of software engineering. I was also very much into it for its potential to solve a problem with self-assessment. If you’ve ever had to design rubrics for any scheme of work, you’ll appreciate how hard it is to structure them for development and to stop them becoming a wish list for an idealised human.

Self-assessment

Teaching is something I’ve done as a job across a few separate domains, but more importantly it is something I’ve baked into my leadership practice. There are nearly always basics that need to be taught. But when things go beyond the basics you dial down the teaching and ramp up the facilitation. The role of the teacher at that point is to assess the response to facilitation and use it to propel the student on their learning journey. My go-to tool is self-assessment² because it also has the nice side-effect of introducing the learner to the idea of reflection as a tool for professional development³. It lets people develop their ability to “think on their feet”, which is a boon to any software engineer.

Self-assessment isn’t a magic bullet. Panadero, Jonsson, and Botella’s 2017 meta-analysis showed that the success of self-assessment depends on how it is implemented, and more specifically on self-monitoring. That’s the rub. Self-monitoring. The problem with self-assessment is that there are three camps of people.

The first camp is very rare and I’m not sure it exists outside of theory or Buddhist monasteries. These people are neither overly confident nor overly critical. They can neutrally assess themselves. I’ve yet to meet anyone who can do this.

The second and third camps are either overly critical (me) or overly confident. To use trendy terms, they either suffer with imposter syndrome and very loud inner critics (again, me) or suffer from the Dunning-Kruger effect, in which case they over-inflate the boundary of their competence as a mask for their (unconscious) incompetence.

Performance Reviews

Enter, stage left, the annual performance review. In an ideal world there is no annual performance review, but that’s another topic for another day.

At their heart, performance reviews combine two very distinct and very complex domains: one, assessment, and two, survey design⁴. I’m going to be a little snarky in how I frame this, yet mostly charitable, and I’ll try to assume good faith. Performance reviews are well-intentioned metrics devised by people who often have zero training or experience in assessment or survey design. They aren’t aware of how complex each of the two domains is, but the intention is pragmatic and perhaps noble: you need to ensure that the money you’re paying people is being returned in value. I get that. But I do feel that engineering teams are different. I don’t think the global HR-driven performance review approach works, and I don’t think KPI-driven reviews work either.

Don’t get me wrong here, I think the above flavours of review work fine for commercially focused roles because once you strip away the questions there’s a common metric: revenue. You either met your targets or you didn’t. Any survey or KPI stats included in the review are ultimately packaging for the revenue metric.

For engineers, designers and all code-facing IC roles, though, there is no such ultimate metric and no, it isn’t tokenmaxxing. There’s no proxy for a foundational-question-answerer such as “do you earn the company more than we pay you and if so, by how much?”. This idea of surfacing metrics was explored in Accelerate through DORA, but the important distinction is that it’s applied at the team level and not the IC level⁵. Sadly, exec types need/want to know about things at the IC level. Predicament.

If I could wave a magic wand I’d pick two metrics for engineers. One, how much have you invested in yourself in the last ninety days? And two, where is that evident in your work? Those two things are invincible. Absence demonstrates a period of maintenance (fine, we all need to rest), regression and neglect are obvious (time to step up on the leadership), whereas improvement signals an answer to the question “Are you worth paying, and if so, are you getting enough?”.

Picture it:

“Alice was upskilling everyone on that new database approach and, judging by development and staging, it looks like that work is going to save £X. Everyone is behind it as it’s easier to manage over time and solves problems A, B and C with minimal tradeoff.”

Hopefully you, the leader, in conversation with your manager.

Invisible team members, visible team members.

Let’s say you have two people.

One is organisationally invisible to commercial teams. They are profoundly loyal, show up and get it done. They understand the organisation, they stay on top of the latest tech and everyone in the engineering teams gels with them. But they also aren’t very visible to the organisation in terms of impact because they are commercially invisible, except in terms of stability. Sadly, no one notices stability until it isn’t there and you’re breaking SLAs and having to offer “make goods”. Stability as a feature is invisible⁶.

The other person is organisationally visible to commercial teams. They’re implementing features left, right and centre to get deals done, but they leave bugs for other people to fix and security holes waiting to be exploited. Commercial people notice them because they solve today’s problems at the expense of tomorrow. The debt is invisible because it’s someone else’s problem.

In performance reviews it has been my experience that, without intervention, the invisible type comes off worst and the visible type comes off best. Not only is that an inverted result, but there’s a temptation for the organisation to begin optimising for the visible engineer type. We know how that story turns out. Technical debt is never paid off. Curiously, another jumping-off point into another topic is CVs and job applications. The visible types tend to have “more inviting

There’s no easy answer to the situation, and it is a cause of the “commercial” vs “systems” tension that often exists. Obviously here, for effect, I’ve framed invisible types as entirely good and visible types as mostly evil. This framing doesn’t stand up to reality. The world is complex. My framing is illustrative to make a point.

GIGO

The invisible engineers who used to keep things stable get overworked and leave. Not long after, the visible types start to figure out they’re going to have to clean up some mess. Some do and become awesome engineers, but most don’t. They leave. It is the equivalent of honey fungus (Armillaria mellea) establishing itself in the cultural ecosystem. The metaphorical engineering trees are blown over by the winds of commercial activity. Where a thriving woodland of engineering once stood, there is now a giant mess of dangerous, partially fallen systems that require thrill-seeking to unpick and make the area safe and operational again.

I am yet to see or hear of any company brave enough to challenge the results or discard the kind of performance reviews that come out of “HR software” or “KPI-driven-executive-conversions-of-strategy-to-tactical-operational-matters”. So the results stand and are acted upon. An act which I find diabolical in nature and one of the reasons that I’ve always advocated for having teachers on staff, but that’s another one for another day.

In an engineering context the final insult is that the results are typically used to decide who is to be praised with “more gold” and who is to be punished with “performance enhancement plans” or even worse: let go. It’s culture rot and it’s awful but as a leader, you may have influence. One time, I did so. When I got the chance I used the framework.

The framework.

Finally the framework. Thanks for staying with me.

If you’re running an engineering team and you’re allowing people to self-assess then you have the two types of people I’ve mentioned in this post. The critical types tend to score low and the overly confident types score high. That data is worthless when combined because it isn’t normalised. All you’d be submitting is the fact you have visible vs invisible people on your team. Pointless at best, dangerous at worst. Whilst I’d prefer there were no annual performance metrics, I’m not sure that world exists. So whilst it’s regrettable, let’s not focus on what we cannot solve and instead focus on what we can. That’s where the standard framework comes into play.

Five Domains

The framework has five domains:

Professional knowledge: technical knowledge and practical application.
Quality of work: standard of work; value of end product.
Initiative: responsibility and quantity of work.
Teamwork: contributions to team building and team results.
Leadership: organising, motivating and developing others to accomplish goals.

Each of the domains is scored from one to five. Criteria are given for one, three and five. Two and four are blank. You’re either moving towards a category (good) or away from it (bad). The general idea is that the rubrics are there to inform conversation and allow the person who is self-assessing to have concrete facts to anchor their introspection to. Each section has three to six areas, eighteen in total. Everyone scores themselves on each area, adds the scores up and divides the total by eighteen. That’s your score. If your score is above three, nicely done. Let’s work on your scores above three as far as you’re willing, as these are areas where you could excel by capitalising on your natural abilities. The things below a three, that’s where the work is. Get them on track to a three and you’re going to find the world opens up in new and exciting ways.
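The arithmetic can be sketched in a few lines of Elixir. This is a minimal sketch and not part of the original framework: the module name `StandardScore`, its functions and the sample scores are all made up for illustration. It assumes every area is scored from one to five and that the full rubric has eighteen areas (3 + 3 + 3 + 3 + 6 across the five domains).

```elixir
defmodule StandardScore do
  @moduledoc """
  Hypothetical helper: averages self-assessed area scores (1..5)
  and flags the domains that contain an area below "standard" (3).
  """

  @standard 3

  # Mean of every area score across all domains.
  # With the full rubric filled in, that is the total divided by eighteen.
  def overall(scores) do
    all = scores |> Map.values() |> List.flatten()
    Enum.sum(all) / length(all)
  end

  # Domains with at least one area scored below standard, sorted by name.
  def needs_work(scores) do
    domains =
      for {domain, areas} <- scores,
          Enum.any?(areas, &(&1 < @standard)),
          do: domain

    Enum.sort(domains)
  end
end

# Made-up self-assessment, one score per rubric row.
scores = %{
  professional_knowledge: [3, 3, 4],
  quality_of_work: [3, 4, 3],
  initiative: [3, 3, 2],
  teamwork: [4, 3, 3],
  leadership: [3, 3, 2, 3, 4, 3]
}

IO.inspect(StandardScore.overall(scores), label: "overall")
IO.inspect(StandardScore.needs_work(scores), label: "needs work")
```

With these sample scores the overall comes out a little above three, and initiative and leadership are flagged as the domains with work to do. The single number is only a conversation starter; the per-area scores are where the 1:1 actually happens.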

Professional knowledge

Category	Below	Standard	Above
PROFESSIONAL KNOWLEDGE Technical knowledge and practical application.	Marginal knowledge of professional domain, the organisation or the role.	Strong working knowledge of professional domain, the organisation and the role.	Recognised expert, sought out by all for technical knowledge.
	Unable to apply knowledge to solve routine operational problems.	Reliably applies knowledge to accomplish tasks and solve operational problems.	Uses knowledge to solve complex technical problems quickly and to a high quality.
	Fails to stay on top of professional knowledge or industry developments.	Remains up to date with professional knowledge and industry development.	Constantly up to date with professional knowledge and industry development ahead of time.

Here’s an example. The name has been changed and the conversation is paraphrased, but it was real.

“I’m a five. I’m exceptional”.

Me: “Ok, let’s do this. Let’s start off with professional knowledge. Describe to me where you think you’re at.”

Alice: “I’m a five.”

Me: “Cool, let’s talk about that. Can you give me an example of where you’re a recognised expert sought out by all for technical knowledge?”

Alice: “Commercial teams are always coming to me to ask how to do things.”

Me: “Sure, but recognised expert indicates outside the boundary of the company. At an industry level, have you given any keynote speeches or presented at a global conference? I’m not trying to devalue your contributions, but your example feels more like the three: ‘Strong working knowledge of professional domain, the organisation and the role’. Is that a fair categorisation?”

Alice: “Yeah but I’m at least a four, I’m above average.”

Me: “Ok, do you maintain a blog? Do you give any talks at local events to indicate that’s where you’re headed? Or are there any other examples that indicate movement towards a five at an industry level?”

Alice: “No, not really, but I’d like to be.”

Me: “Great, give yourself a score for this.”

They scored three and that’s fine. Threes are rock solid. I’ll take a team of threes every day. If you read through the rest of the framework you’ll see why. They later went on to start doing talks at local events and then rightfully claimed the four. The thousand dollar question is: would they have started doing talks without this little reality check? I will never know. But that’s often the case with leading; you’re just the wind and a suggestion to hoist a sail. The action is down to the person.

I don’t feel I need to labour the point with a critic-type example, but suffice to say it pulls critic types up and also orientates them towards self-development. I’m a big fan of hard truths and hard conversations delivered with as much compassion and contextual sensitivity as can be mustered. Ultimately though, the band-aid is coming off. But by working through each of the sections you have a means for deeply powerful self-introspection. It’s as free of bias as you’re likely to get because everyone works from the same rubric and they’re self-assessing. If the people you lead are willing, it’s a very powerful set of foundations on which to progress.

Ok, let’s tear through the other sections of the framework. As an exercise for you, dear reader, focus on the standard column. Notice how absolutely solid the standard is.

Quality of work

Category	Below	Standard	Above
QUALITY OF WORK Standard of work; value of end product.	Needs excessive supervision.	Needs little supervision.	Needs no supervision.
	Work frequently needs reworking and creates production issues.	Produces quality work. Few errors or resulting rework and maintains production performance.	Always produces exceptional work. No rework required and often includes production performance enhancements.
	Wasteful of resources and time.	Uses resources and time efficiently.	Maximises resources and time.

Initiative

Category	Below	Standard	Above
INITIATIVE Responsibility, quantity of work.	Needs chasing to complete work or admin.	Productive and motivated. Completes work and admin fully and on time.	Energetic self-starter. Completes work far better than expected and admin with exceptional clarity.
	Prioritises poorly, fails to manage self.	Plans and prioritises effectively. Adequately manages self.	Plans/prioritises wisely, with exceptional foresight and self-management.
	Avoids responsibility.	Reliable, dependable, willingly accepts responsibility.	Seeks extra responsibility and takes on the hardest jobs.

Teamwork

Category	Below	Standard	Above
TEAMWORK Contributions to team building and team results.	Creates conflict, unwilling to collaborate with others, puts self above the team or does not support the team’s work.	Reinforces others’ efforts, and does what is required to meet team objectives.	Team builder, inspires cooperation and progress, and goes beyond requirements to assist the team’s objectives.
	Fails to implement or deliver on team objectives.	Implements and delivers on team objectives.	Implements and delivers on team objectives and is effective at aiding others on their delivery.
	Does not take direction well.	Accepts and offers team direction and feedback.	The best at accepting and offering team direction.

Leadership

Not everyone is going to, or wants to, become a leader. Having said that, everyone has some leadership responsibilities even if they’re limited to being led. Leadership is a two-way, active pursuit.

Category	Below	Standard	Above
LEADERSHIP Organising, motivating and developing others to accomplish goals.	Neglects growth or development of teammates.	Effectively stimulates growth and development in teammates.	Inspiring motivator and trainer, teammates reach the highest levels of growth and development.
	Fails to organise, creates problems for the team. Does not contribute to process improvements.	Organises successfully, suggests process improvements and efficiencies.	Superb organiser, great foresight, suggests and sees through process improvements and efficiencies.
	Does not set or achieve realistic goals or estimates, which hinders the organisation’s mission or vision.	Sets/achieves useful, realistic goals and estimates that support the organisation’s mission and vision.	Estimates and goals are accurate and predictable. They dramatically further the organisation’s mission or vision.
	Lacks ability to cope with or tolerate stress.	Performs well in stressful situations.	Perseveres through the toughest challenges and inspires others.
	Inadequate communicator who is overly verbose, or whose communication requires chasing or clarification.	Clear, timely and succinct communication to the lead and their team.	Exceptional communicator to the team, lead and the entire organisation.
	Tolerates and practises unsafe working practices, rarely follows legislation and is below industry codes of practice.	Demonstrates safe working practices and follows legislation and industry guidance and codes of practice in nearly all situations.	Upskills teammates with safe working practices, is exemplary in following legislation and industry guidance and codes of practice.

You are not a number

You are you. A uniquely, intangibly brilliant, deeply complex human being who is positioned somewhere between being a visible engineer and an invisible engineer. Your location is entirely of your choosing. You are, however, probably locked into being a self-critic type or an overly confident type. Maybe there’s room to move the needle on that, but without/until a tonne of meditation, you are who you are. I’m a critical type and whilst I wrestle with my inner critic (sometimes to the point of complete mental shutdown)⁷, I wouldn’t like to see “the auditor⁸” leave my psyche. As I’ve learned through meditation, he’s useful.

Whilst I’m fairly sure this framework in its original context was used to reduce people to numbers, I’m not into that at all. When I used it, the idea was that it was always about self-assessment. Yes, results were fed back up to HR but I’d rather that than the obligatory KPI-driven results.

1:1s and orientation

As stated at the beginning of this post, self-assessment can be a very powerful tool² when used correctly⁹. One of the best uses of self-assessment is in reflective practice. It also lends itself very well to being a formal aspect of 1:1s because all professional domains benefit from reflective practice. To be clear, I liked to keep my 1:1s with my reports conversational. It was their time with me before it was my time with them. So I never used to go over each section like a banal doctor would in a boring consult. Rather, it focused on the needles and dials we were trying to move over the course of time and how I could best support that.

Once you’re out of the basics of a subject, learning ceases to be driven by your teacher or lead and it moves to you. Also, the surface area of skills in the engineering domain is massive. In that near-infinite expanse of a landscape, if you don’t have a map to guide you (the rubric) then you’re going to get lost, disorientated and quickly fall back to whatever survival habits are your defaults. I think this maps onto the phrase “you don’t have X years of experience, you have Y years of experience X times”. A framework like this keeps you firmly tracking onto the much more fruitful path of X years of experience ONCE.

The framework is very useful for that but remember that you are not a number.

But, your number can be useful

I’m a reflective person, the auditor⁸ makes sure of that. I listen to a lot of things and I read a lot of things. I start a lot of projects in bits and bytes and in the real world. I don’t finish everything I start because I don’t think everything that is started needs to be finished, at least not synchronously. For me, that’s where my number comes in. Is the thing I’m doing contributing to me moving towards exceptional?

case {direction, context} do
  {:forward, :personal} -> {:ok, [:finished, :bored]}
  {:forward, :professional} -> {:ok, :finished}
  {:static, _} -> {:ok, :maintain}
  _ -> {:error, "Life is precious, use time wisely"}
end

This framework is useful because of its utility in multiple scenarios. Personal growth, professional development, inner critic doom spirals to name a few. But the best thing is how it normalises over the entire race of humans. I cannot see a world in which I hit five in more than a few of the rubrics. Three maybe four of them, but certainly not all eighteen. But I don’t stop trying. Neither does it change the fact that this world is full of exceptional human beings. It’s the act of self-improvement that’s important.

If I’m ever leading you and you don’t like this framework, that’s cool. Let’s use one that works for you. The important thing is we can agree on what standard means and setting standard as the goal. In a world full of humans, there’s the very real possibility that standard is a goal not a default.

Before I wrap this up, I want to be clear on one thing. I don’t feel like being sub-standard as a starting point is a blemish or a negative. However, it’s a death signal to HR types and some professionals who share some traits with ostriches. To that end, on a personal level, being sub-standard is fine. However, I may have allowed certain fudging to go on before these numbers went up to HR.

Anyway, regardless of whatever framework is used, the idea of standard being both solid and the goal - that’s why I’m always aiming to be a standard, well rounded human being.

There’s glory in being dependable, stable and consistent over time in personal and professional improvement.

🐢 vs 🐇

Thanks for reading. I appreciate your attention.

Cheers,

Jamie.

¹ Jocko Willink podcast back in 2019. It’s actually from the US Marine Corps. Don’t let that overshadow the usefulness of the framework.
² Black and Wiliam’s “Inside the Black Box” (1998) and their larger review “Assessment and Classroom Learning” are the most cited starting points for support for self-assessment.
³ Donald Schön’s “Educating the Reflective Practitioner” argues that professional education should move away from strict technical theory and focus on developing a practitioner’s ability to “think on their feet”.
⁴ Accelerate, one of my favourite tech leadership books, devotes an entire half of the book to this. Think about that. Half a book on survey design.
⁵ DORA and SPACE and DevX will be something I explore as I get through my backlog of things that I want to write.
⁶ Unless you’re post-Microsoft GitHub or an AI provider.
⁷ When it happens, I just go to the woodland and get after it.
⁸ My inner critic is called the auditor. We have a working relationship. I don’t like him and clearly he hates me.
⁹ But it has to be used correctly, as discovered by Panadero, Jonsson, and Botella’s 2017 meta-analysis.