Covalent Data, first impressions

Covalent Data is a tool for searching across research topics. It seems to have the following features:

  • It is a database of grants, papers, people, and institutions.
  • It claims to use machine learning to tie these entities together.
  • Search results don’t seem to be exportable; for example, although the grant award results do list the amount of each grant, getting a total amount for a search term would mean extracting each data point manually.
  • It is hard to determine what their sources are, and specifically which databases are not covered.
  • For multi-word searches the engine prefers to return “near results” rather than exact matches. It was quite fiddly to force exact matching in the search interface, and I’m not convinced I actually got it to work.

Some example searches:

  • “Digital Humanities”: 16 results found, all from NSF.
  • “computational social science”: 31 grants, 30 publications; again, all NSF-funded grants.
  • “social science” + “big data”: 90 grants, 24 publications.
  • “stem cells”: 76k results.

In contrast, the NSF grant search tool does allow you to download results, which makes totals straightforward to compute (see the sketch after these counts). It found the following:

  • “computational social science”: 13 grants
  • “digital humanities”: 23 grants
  • “stem cells”: 632 grants
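
With a downloaded CSV in hand, totalling the awarded amounts for a search term is a few lines of scripting. A minimal sketch — the file name and column name are my assumptions about the export format, so check them against a real download:

    # Rough sketch: total the award amounts in an NSF award-search CSV export.
    # "Awards.csv" and the "AwardedAmountToDate" column are assumed names.
    import csv

    total = 0.0
    with open("Awards.csv", newline="") as f:
        for row in csv.DictReader(f):
            amount = row["AwardedAmountToDate"].replace("$", "").replace(",", "")
            total += float(amount or 0)

    print("total awarded: ${:,.2f}".format(total))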

I was puzzled as to how Covalent Data found so many more results for stem cells, when both NSF and NIH report far fewer grants.

Overall I’d like to see more grant agency coverage, more clarity around how results are generated, and an export facility. I could see Covalent Data becoming a useful tool at some point in the future, especially if I felt that it was taking away the pain of having to go to many different sources to find grant funding information. Right now I’m not sure I trust it, and the results, as returned, are a bit hard to work with for onward analysis.

Goodbye eLife!

So after nearly four and a half years I am moving on from eLife. I’ve had an amazing time, worked with some amazing people, and we have gotten a few really nice things done.

First off, we are hiring a replacement for my role; this is an amazing opportunity to effect real change in scholarly publishing. eLife has just announced follow-on funding of £25M to sustain us through to 2022. We have a great dev team, we have buy-in from a hugely respected editorial board, and our submissions are going from strength to strength. Being open, and making the software we build open source, is baked into our culture. If you are excited by the possibilities, then do think of applying!

Secondly I’d like to cover the reason why I’m leaving. Eight weeks ago my wife delivered our beautiful little daughter Laira, and now we are juggling two little ones at home. I live in London and commute most days up to Cambridge to work at eLife. Though I was not in the market for a new job, when an opportunity came up that was just around the corner from where I live, it was something I had to think seriously about. After thinking deeply about it, I made the decision that this new opportunity was both sufficiently exciting and could give me the ability to support my family in a way that is just not possible with a commute to Cambridge as part of my daily routine. I am a strong believer in putting family first in these decisions. Life is not a rehearsal for some time when we will get to do it all again in the future, but better. It’s also not lost on me that when I moved from Mendeley to eLife, that was driven by the upcoming birth of our son. I have been extraordinarily lucky to have had such great opportunities that allow me to support my family in this way, while at the same time allowing me to pursue work that is exciting and impactful.

Finally I’d like to look back over my time at eLife and give a personal reflection on what we have achieved. If I count it correctly I think I was the eighth person to join the team (not including our amazing editor-in-chief Randy). At the time of joining I wrote:

"With eLife I'm convinced there is an opportunity to make a contribution and an impact too. It's in front of us now, and we have the opportunity to do something great."

Well, four years on, I think we have done something great. When I joined, eLife was literally a blank sheet of paper. One of the first things we did, even before launching our journal platform, was to start attracting submissions, and when they had been submitted, posting them directly to PubMed Central with no delay. At the time this caused fury in some areas of the publishing world, but it was the right thing to do for the researchers, and for science, and I think that kind of set out our marker that we were not simply interested in doing things the same old way.

Those manuscripts went through the eLife peer review process, a collaborative process that almost totally eliminates the third reviewer problem. There are not a huge number of innovations happening in peer review at the moment, so I think what eLife has done here is really laudable, and it’s great to see it getting traction in some other journals.

Of course I was brought on board to deal with the technical development of the journal, and in December of 2012, with Highwire as a partner, we launched the eLife journal website on an incredibly ambitious timescale. At the time it was widely lauded as having the clearest layout for a scholarly article page of any journal on the web. We brought videos inline, made a good effort at no longer abandoning supplementary materials to the ghetto of the page footer, and made it far easier to see related images than had ever been done before. A lot of that was due to working with a design agency that had no previous experience in the scholarly space (http://ripe.com), and tackling the problem as a straight-up UX problem, rather than a problem specific to research. Hand in hand with that went a focus on getting as much value as possible into the XML, including funding information and well-structured contribution information for authors. We have posted our sample XML to GitHub, and we push the XML of all of our journal articles into GitHub, along with some nice tools for parsing them.

One of the most fun products that we worked on was eLife Lens. We did this in collaboration with Ivan Grubisic, who had the idea, and the http://substance.io team, who added a great deal of coding and thinking muscle to the project. Lens has gone on to be adopted by a number of other publishers, and I’m excited about its future.

Another great highlight of my time at eLife was when Randy won the Nobel prize in 2013. Amazingly, we were among the first people he called, even before the news had been press-released. The traffic to the journal got a nice spike that week.

Earlier this year we hit another big milestone, and we took over full hosting of our own content, along with developing the production system behind the scenes that powered that. We are going to open source eLife Continuum in the next couple of weeks, and that’s one of my big remaining jobs to get done before I move on.

All of this technical development happens in support of science, and the development team’s mission at eLife is:

	To build a platform for research communication that embodies the best practices of open development and that treats its users with respect. 

It’s been incredible to see the support that eLife has from the scientific community. We are now getting well over 600 submissions per month, and the quality of the research that we are publishing is fantastic, from editing RNA, to work on the Zika virus, right through to the discovery of a new species of Homo. The papers we publish make their data open, and their reviews open too. It’s just fantastic to see the scholarly community embracing such transparency, and it’s making a real impact on improving the way science is done.

On a more personal level, all of these things were achieved through the dedication and hard work of a large number of people. I had the great pleasure of working with an amazing team, and with some amazing partners, over the last four years. When I was evaluating joining eLife I got in touch with a mutual friend who had worked for my boss Mark. He said that Mark was the best boss he had ever had, and that if I ever got a chance to work for him, I should jump at it. Well, jump I did, and I can very gladly report the same. I’ve learnt so much working for Mark, and I hope to carry some of that with me in my future career.

The technical team that we have built up over the last four years are amazing, and I’ll definitely miss working with David, Nathan, Sian, Chris, Luke and Giorgio. I’ll miss the great interactions with the planning team, the support from eLife, and the general sense of camaraderie within the office.

I am so incredibly proud and humbled to have been a part of this initial journey for eLife. Knowing what I do about what is coming down the line, I’m just really excited for the future of the journal and the continued future impact that eLife is going to have. I feel that it couldn’t be in a stronger position right now, and I wish whoever comes in to shape the role I’m leaving as much fun and enjoyment as I’ve had over the past four years.

A(peeling) Peer Review, a proposal.

eLife’s peer review process is really good. One of the key attributes of this is that reviewers are not blind to one another, and they have to consult with one another. This largely removes the third reviewer problem. We also publish the decision letters and the author responses to the decision letter.

Reviewers have the option of revealing themselves to authors. As with most review systems, our reviewers know who the authors are. We are not at the point where our review process is fully open; this is the kind of thing that is community driven. My own hope is that we can move towards fully open review in time.

Even in fully open review, where there is no blinding between authors and reviewers, I think there is a case to be made for making the reviewers blind to who the authors are. They will find out eventually, when the paper is published.

You can argue that this is pointless because in a small field everyone knows who everyone else is anyway. Indeed, the evidence from small-scale studies is mixed, with some evidence in favour of, and some against, the thesis that this masking will help improve the quality of review.

With the growth of research in the BRICS nations there are increasing numbers of papers coming in from labs that might not be that well known, and that might suffer from this potential bias. Researchers from these nations certainly fear this kind of bias, and when you construct the study in a certain way there is some evidence to support this feeling.

There have been a few case studies, which I am unable to dig up at the moment of writing this post, but the gist of what they did is this: they took a selection of already-published papers and resubmitted them with the author and institute names replaced by ones that would appear to be from less prestigious labs and countries. Most of the papers thus resubmitted were rejected.

There is no evidence that I’m aware of suggesting that this blinding decreases the quality of review, or increases biases in review.

So if we do introduce blinding of authors from reviewers at the review stage, it’s not likely to hurt, and it is likely to increase BRICS researchers’ feeling of confidence in the system.

What would be great, though, is if after this blinding, and after publication, we could reveal all of the identities of those involved, with everyone knowing up front that this was going to happen. We could, so to say, peel away the anonymity of the review process, layer by layer. We might have an appealing model of peer review, one in which the incidence of appeals was reduced and the eventual transparency could lead to better decisions.

So that’s my proposal for a review system, one in which we peel back our layers of shielding at the end.

It may well be that this is already happening; I can’t think of journals that are doing it exactly like this off the top of my head, so do let me know!

data science vs statistics talk at the RSS

Tonight I attended a debate held at the Royal Statistical Society. It was an entertaining and wide-ranging discussion with a great panel, and fairly intelligent questions from the floor.

Two fun facts:

  • One of the founding members of the society was Charles Babbage
  • The society is located within spitting distance of the grave of Thomas Bayes

So, here are the notes, rough and ready; I missed a lot in my note taking, but I hope they get a flavour of the discussion across.

Data Science and Statistics: different worlds?

Introduction

Data science needs to define itself, perhaps in order to survive. There is a shortage of people in these roles, and the feeling is that statistics and computer science are both core skills.

We also need to develop some new methodologies, some aspects of statistics need to change when you work at scale. We need an automated and large scale approach to the problem.

Every company knows that they need a data scientist, but many of them don’t know what skills a data scientist needs.

Chris Wiggins (Chief Data Scientist, New York Times)

Chris has a nice story about the moment that biology got a big interest in data: biology became awash in data as a result of genetic sequencing. Many fields are now experiencing similar issues. At the NYT he has been building a data science group to help understand how people interact with the NYT product.

Abundant data that is challenging to learn from

They have created a certification in data science at Columbia University.

David Hand (Emeritus Professor of Mathematics, Imperial College)

Was professor of statistics for 25 years at Imperial College, and then got made an emeritus professor of mathematics.

His statistical interests overlap topics such as machine learning and data science (ML is clearly a sub-discipline of statistics (tongue in cheek)).

Has published rather a lot of books.

Wants to make a few fundamental points:

people don’t want data, they want answers

data aren’t the same as information (e.g. with pixels in digital cameras, cameras keep creating more data, but the information in the picture remains the same)

two key areas of challenge: data manipulation (automatic route finders, recommendations), where the challenges are mathematical and about counting; and inference, what the data will say about the future rather than what has happened in the past - that is statistics.

all data sets have problems; data quality is a key issue. It’s potentially even more serious for large data sets than for small ones, as the computer is an intermediary between the person and the data.

the assertion that we no longer need theories, that we can just look at the data, can more often than not lead to mistaken conclusions

Big data doesn’t mean the end of small data. There are more small data sets than big data sets.

Francine Bennett (Founder, Mastodon-C)

Founder of Mastodon C. Was a maths undergraduate (pure mathematics), started a PhD, dropped out, became a strategy consultant, got bored drawing PowerPoints, moved to a Google strategy team, and tried to solve problems with tools other than Excel.

There is now a niche for companies to apply the tools that have come from online advertising, to non advertising and non banking related problems, e.g. the built environment.

A lot of what they do is a combination of software engineering and data science. It’s sometimes used as a put-down that a data scientist is someone who is better at engineering than a statistician and better at stats than a programmer, but that breadth is critical to making things work.

Patrick Wolfe (Professor of Statistics, UCL / Executive Director, UCL Big Data Institute)

He has seen data science from different perspectives: studied electrical engineering and music, did a PhD that combined these, looking at systems to restore old audio recordings; his interests were very statistical in nature. Has always maintained one foot in electrical engineering and one foot in statistics. Is executive director of the Big Data Institute at UCL.

His personal research interests revolve around networks. His mathematical interests at the moment are about how we understand the structure of large networks.

Thinks that this is a once in a lifetime opportunity for all of us. There is an opportunity for statistics to participate in a dramatic change in how we understand and collect data. The paradigms in how we collect data are clearly changing.

What will it mean to shape the future of data science. We need to create an intellectual core.

That core is related to mathematics, statistics and computer science.

For statistics we must recognise the following paradigm shift: we all learnt about designed small scale experiments, and data was expensive. Now everyone wants to work with found data. It’s the responsibility of statisticians to help people do that. Statisticians have a responsibility to teach people how to draw inferences from found data. They can’t be the community that is always telling people what they can’t do, or how they should have collected the data.

Zoubin Ghahramani (Professor of Machine Learning, University of Cambridge)

Is an accidental statistician. Started out interested in AI and in neuroscience. Couldn’t make his mind up between these two fields, got a PhD in neuroscience, but got interested in making computers adapt to difficult problems.

An old AI paradigm was that you get intelligence in computers when you combine them with data, and the more you move towards machine learning the more successful you are. At this point these people needed to learn more statistics. His program now covers Bayesian non-parametric solutions (coincidentally, something my father-in-law worked on).

What is data science? He thinks that the answer depends on whether you talk to academics or people in industry.

People in industry have a very different view from the view from academia. People in industry just have problems that they want to solve. Their buzz phrases are big data, business intelligence, predictive analytics, etc. In industry the largest part of the problem is the collection and curation of the data. Then there is a small part of the fun part- the machine learning- then there is a large part of presenting and interpreting the data.

Let’s not overestimate the role of the statistician in this pipeline. There are interesting research challenges in thinking about the whole pipeline.

Academic fields are social constructs, they are networks. This is something we need to overcome, we need fewer barriers between these disciplines and we need more people with these skills.

This is an opportunity, and also a threat, to statistics. If statistics does not move quickly enough there are many other disciplines that will want to jump into this space. He is talking to other departments and setting up cross-departmental projects.

Discussion

Point is made that statistics can’t be everything, it can’t get to the point where the definition is too broad.

Statistics has remained a relatively small discipline (I’m not so sure about this). Contrasts stats with electrical engineering. One of the interesting things to watch will be how a small and traditional discipline goes through a growth phase where not everyone will understand what is going on.

The point is made that this happened a while ago with the branching out to things like biometrics - this happened 50 years ago.

The call is made to broaden, not merely shift, the skill sets of statisticians.

It’s also noted that the level of communication needs work (across the board). If you are told that you are stupid if you don’t know what a t-test is, then you might, as a computer scientist, just choose to run a neural network and do machine learning instead. It might not be easier in principle as a technique, but the routes to using it are easier.

As computation becomes more and more personal, data science is shaping our personal relationships and that is going to draw more people into those fields.

Questions from the floor

More or less - How do you learn to be a data scientist?

The discussion focusses on how it is much easier to learn programming than it is to learn stats. The reasons for that are historical. As a profession, statisticians are professionally trained skeptics. There is almost a visceral reaction to seeing people trying a bunch of things out.

The role as a statistician should not be to nay-say but to help people figure out what they can reasonably say with the data that they have.

Observation - questioner is really concerned with what is going to happen with the A-level mathematics curriculum. The good mathematicians in the class are going to have a bad experience with this new course, and get driven away from the field.

A view of statistics from a machine learning point of view is that ML might be a good way of tricking people into learning statistics!

It kind of depends on whether you view statistics as a mathematical, scientific, or engineering discipline.

If you are a mathematician you want to make rigorous statements, if you are a scientist you want to understand the world, and if you are an engineer you want to build things. Historically, statistics in some cultures has been thought of as a mathematical discipline, with definite, and not always good, consequences. For example, in the US to get funding for statistics you have to prove your mathematical worth; however, there are many, many areas of engineering where you can use statistics to help you build things, and where it is harder to get funded as a result of the field’s classification as a mathematical discipline.

The point is made that we need both the adventurous and the conservative. It’s quite important that we retain the ability to be critical (would be nice if we could propagate that ability out to the general public).

It’s also agreed that stats really needs to graduate beyond just being a mathematical discipline.

The rise of data journalism is referenced as a source of hope, a way to convert/communicate to non-specialists the power of statistics and data to help understand the world.

Nice question about the polls in the UK ahead of the election: were these polls found data, or designed data?

The answers are not very conclusive.

Point: industry has lots of problems - 90% computing and maybe 10% stats. People who come from a CS background have a much better chance of succeeding than people who come from a stats background.

The panel discusses. Sometimes recruiters at companies think of machine learning as equivalent to Hadoop or Java. It’s not quite like that. You can gain a basic understanding of these tools, but going beyond just downloading some software and running it is much harder. There is now demand in the market for people with PhDs who have years of experience.

As the years go by you will start to see a refinement in the job descriptions, e.g. data dev ops, data engineering, data science.

There is a call to inject algorithmic thinking earlier into the statistical curriculum. (There is a two cultures paper looking at algorithm vs inferential thinking). There is a discussion of the new kinds of trade offs that will need to be navigated with found data. Teaching algorithms, stats and visualisations at the same time can help.

What is the role for a body such as the RSS, what could they most useful do to take this agenda forward?

Attract outsiders. There is a large appetite to learn the knowledge that the RSS have, but the apparent barrier to entry is a bit too high.

It’s as if statistics wants to make things as difficult as possible, rather than starting by showing people the joy of statistics, the joy of discovery.

A controversial point is made: one of the non-statisticians on the panel finds classical statistics very hard but finds Bayesian statistics very common-sensical. People might find it more intuitive to learn Bayesian statistics first, and classical statistics later.

It could be great to figure out a way to make academia speed up, and industry slow down (more discussion on this point).

How do you make industry take up what is happening in academia?

The view is that there is not a gap here right now.

Question on the science part from an ecologist. When it comes to asking the question of why, how do you get to an understanding of what the appropriate questions to ask are? (I think that was at the heart of the question.)

It’s suggested that the service model for this domain still needs to be worked out.

This whole discussion has been about data technology; we are really talking about engineering rather than the creation of new tools.

Often the gap is in data science talking to the business. Especially if the data disagrees with the opinion of an important person in the business.

You can teach students how to communicate by getting them to participate in a few different kinds of projects with different types of audiences.

It’s a strong tradition in statistics to work with someone from a different domain (e.g. natural scientists). Training people to help with the creation of experimental design.

Data science has the opportunity to broker between the people who have the questions and the data that they have to work with.

John Pullinger - UK National Statistician - summary of the discussion

What they want to do as a society is to welcome everyone.

As the debate went on John was thinking that the RSS was started when old fields were faced with new techniques. One of the founders of the society was Charles Babbage. Another early member was Florence Nightingale. The statistical profession that John knows are mainly in the business of making the world a better place.

Today’s revolution is a once in a lifetime opportunity.

How do you help with the demand side, how do you help to educate people to make use of these tools. We need to educate the problem solvers.

How do you help to solve the supply side? John picked up four themes:

There is a lot of data; it’s just starting, and it’s going to ramp up very fast. But you have to care about what it is and where it comes from. We have to care about method. There is a new methodology emerging at the boundary of these areas, and that deep thought will come from our universities. Technology is driving change. Finally, we need skills, the back-to-the-future skill set: we really have to grasp how we teach people these skills.

What are these skills? Some of them are new, but some of them are old standards.

The defence of data skepticism is important, but you can’t be a naysayer; it’s about curiosity. This data skepticism has to be at the heart of this toolset.

Bias is also core: every dataset has bias, and understanding this bias is where you get creativity from statisticians and computer scientists.

Uncertainty is also fundamental.

Discerning which patterns have some meaning, in contrast to the random fluctuations of the universe.

aws london summit notes

Amazon Web Summit London 2015

Keynote

There were about three thousand people at the summit. I chatted to a few people throughout the day. Their experience with AWS ranged from moderate use through to just being at the evaluation stage.

The keynote highlighted AWS’s approach of wanting to put the customer in control, and to remove all unnecessary work from the customer in terms of managing IT.

AWS has grown enormously, they are estimated to have five times the compute power on hand than all other cloud providers combined. They have over one million active customers, and many of their services have more than doubled in terms of usage in just the last year alone.

Most of the keynote was given over to hammering home this message and to having companies that use AWS services come up and talk about their usage.

There were two products discussed in the keynote that piqued my interest.

Elastic File System

AWS now offers an NFS service. When I heard this, my thought was that it might be able to replace our NFS box in the office. I went to a session on this later in the day, and it needs a bit more investigation.

Amazon Machine Learning

AWS now provides a simple console for creating machine learning models. What is nice about the service is that once you have trained your model you can choose to either put it into a data processing pipeline, or you can create a REST endpoint that can act as a decision endpoint.
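
As a rough illustration of what that REST endpoint looks like from code, here is a minimal sketch using boto3. The model is assumed to already exist with an active real-time endpoint, and the model id and feature names are hypothetical:

    # Minimal sketch, hypothetical model id and features: score one record
    # against an Amazon Machine Learning model's real-time endpoint.
    import boto3

    ml = boto3.client("machinelearning", region_name="us-east-1")

    # Look up the model's real-time prediction endpoint (created via the
    # console or create_realtime_endpoint; it takes a moment to become ready).
    model = ml.get_ml_model(MLModelId="ml-example-model-id")
    endpoint_url = model["EndpointInfo"]["EndpointUrl"]

    # Amazon ML expects all feature values to be passed as strings.
    result = ml.predict(
        MLModelId="ml-example-model-id",
        Record={"subject_area": "neuroscience", "abstract_length": "1500"},
        PredictEndpoint=endpoint_url,
    )
    print(result["Prediction"])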

We don’t have any use cases for this at eLife right now, but there are some things that might be fun to experiment with:

Picking papers that need an insight

We have a collection of papers that received insights, and we have a collection of papers that don’t have any insights. We could use this training data to see if we can build a model that can predict whether a new paper might get an insight. This might provide a way to give support to the human curation process of picking papers that get insights by pointing to new papers that have insight-like characteristics.
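
A sketch of how that experiment might start, using scikit-learn rather than the AWS service so it runs locally; the input file and its fields are hypothetical:

    # Hypothetical sketch: train a simple text classifier on abstracts of
    # papers that did / did not receive an insight. papers.json is an assumed
    # file of the form [{"abstract": "...", "has_insight": true}, ...].
    import json

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    with open("papers.json") as f:
        papers = json.load(f)

    texts = [p["abstract"] for p in papers]
    labels = [p["has_insight"] for p in papers]

    X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.2)

    model = make_pipeline(TfidfVectorizer(max_features=20000), LogisticRegression())
    model.fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))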

Predicting highly read papers

We could feed reading data about papers with author, paper or institution data to see if we can predict which of our papers might be read widely in the future based on any features of the papers that we can identify. This might give our marcomms team advance warning about papers that might be worth investing attention in.

Predicting whether a paper will get accepted or rejected

Since we have accepted and rejected manuscripts we could use this information to create an acceptance model. The hope would of course be that this model will be useless, and that only full proper peer review can do the job of deciding whether a paper can be accepted or rejected. It would be interesting to see if that null hypothesis holds, or whether we might uncover any bias in acceptance.

An interesting side effect of attempting to create these models might be the creation of a requirement for the kinds of API endpoints that we might like to be able to make available to access eLife content, in order to hit this machine learning service.

Other thoughts from the Keynote

It’s increasingly common to have a pattern where state information about the system is coming from many different sources, whether that be information about user behaviour, inventory or resources. In this world how one manages ones single source of truth becomes an interesting question. It seemed like some companies are using Amazon Redshift to manage this issue.

Sessions

After lunch I managed to get to three sessions.

Amazon Container Service

This is a service for hosting docker containers, and deploying them in an efficient way across a cluster of EC2 instances. The container service is responsible for ensuring efficient allocation of resources is taking place, and it can provide a monitoring and control layer for your containers deployed at massive scale. The requirement is that you have to build your EC2 instances with an AMI that supports the Docker protocol. You can use a scheduler provided by Amazon, or you can use your own scheduler, which understands your business needs.

A philosophy behind using containers is that instead of patching software in production, you get to a point where your development pipeline outputs new containers that are versioned, and you never patch software in a production environment; you just swap in a new container, making it easier to go back to an old container if you need to.
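
To make that swap-in-a-new-container idea concrete, here is a minimal boto3 sketch; the cluster, service, family, and image names are all hypothetical:

    # Hypothetical sketch: register a new revision of a task definition that
    # points at a freshly built, versioned image, then update the service so
    # ECS swaps the new container in (rolling back means pointing the service
    # at the old revision again).
    import boto3

    ecs = boto3.client("ecs")

    task_def = ecs.register_task_definition(
        family="journal-web",
        containerDefinitions=[{
            "name": "journal-web",
            "image": "example-registry/journal-web:1.4.2",  # versioned, never patched in place
            "memory": 512,
            "essential": True,
        }],
    )

    ecs.update_service(
        cluster="production",
        service="journal-web",
        taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
    )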

The main AWS talk on this topic was mediocre, and mainly just a product pitch, but afterwards the head of automation from Hailo talked about how they had adopted this service. For me the most interesting thing from his talk was how he described their move to a micro-services architecture. They run about 200 independent services at any point in time. He didn’t go into detail, but he described how each service gets deployed in a way that automatically provides entry points to the developer for logging, A/B testing, and test coverage. That means the application developer can spend most of their time working on the business logic.

It got me thinking about our elife-bot. Our elife-bot has a master controller in the cron.py part of the bot, and each of the components is tightly coupled, by being in the same codebase and by coupling via the SWF task queues. If elife-bot were truly micro-services architected, then we would be able to deploy an update to any single task in our processing pipeline without affecting any of the other processes, other than via the data that gets passed from one process to another. At the moment all of these processes are deployed from the same repo to the same EC2 instance. I’d like to see us move to a situation where they are more separated than they are now.
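
As a sketch of the direction I mean (this is not the actual elife-bot code, which uses SWF; plain SQS queues stand in here to keep the example short), a task coupled to the rest of the pipeline only by message data could look like this, with hypothetical queue names:

    # Hypothetical sketch: a self-contained pipeline worker. Its only coupling
    # to the rest of the system is the message data it consumes and emits, so
    # it can be deployed and updated on its own.
    import json

    import boto3

    sqs = boto3.client("sqs")
    in_queue = sqs.get_queue_url(QueueName="convert-pdf-input")["QueueUrl"]
    out_queue = sqs.get_queue_url(QueueName="convert-pdf-output")["QueueUrl"]

    while True:
        resp = sqs.receive_message(QueueUrl=in_queue, WaitTimeSeconds=20)
        for msg in resp.get("Messages", []):
            article = json.loads(msg["Body"])
            # ... this task's single job on `article` would go here ...
            sqs.send_message(QueueUrl=out_queue, MessageBody=json.dumps(article))
            sqs.delete_message(QueueUrl=in_queue, ReceiptHandle=msg["ReceiptHandle"])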

In the Q&A, the advice on how to do this came down to seeking out a component that can be removed from your system with low risk, and putting that into a container.

Another approach is to look at using the Amazon Lambda service, and I happened to go to that as my next session.

Amazon Lambda

I went into this session with a skeptical frame of mind. Lambda allows you to upload code, and that code is run when a certain event happens. You can trigger these events via a number of routes, including changes to an S3 bucket, or modifications of a DynamoDB table. You can get other AWS services to trigger a Lambda function by getting CloudTrail to log to an S3 bucket, and have that bucket trigger the Lambda function (I would expect native support for other services to be released in due course).

What is interesting is that you only get charged per 100ms that your function is running, so you don’t have to pay for an Ec2 instance to be up and idly waiting for an event to come along if that event can be sent to a lambda function.

There are some constraints: Lambda functions only run on a basic AWS EC2 instance, and if your function takes longer than 60s to run it will get terminated.

In spite of those limits, most event-driven publishing workflows could be modelled fairly well using the Lambda service, and it could lead to a responsive, low-cost service.

The speaker in this session was excellent, and he outlined some really compelling use cases for lambda. One that caught my eye was the one of creating thumbnail images. An image file hits an s3 bucket and that could automatically trigger the creation of a thumbnail of that image, and the population of another s3 bucket with that thumbnail.
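
A sketch of that thumbnailing flow as a handler function (the talk's demos were node.js; Python is used here for consistency with the other sketches, and the bucket naming convention is hypothetical):

    # Hypothetical sketch of an S3-triggered thumbnailer. S3 put events carry
    # the bucket and key of the new object in the event payload.
    import os

    import boto3
    from PIL import Image

    s3 = boto3.client("s3")

    def handler(event, context):
        record = event["Records"][0]["s3"]
        bucket = record["bucket"]["name"]
        key = record["object"]["key"]

        local_path = os.path.join("/tmp", os.path.basename(key))
        s3.download_file(bucket, key, local_path)

        image = Image.open(local_path).convert("RGB")
        image.thumbnail((128, 128))
        thumb_path = local_path + ".thumb.jpg"
        image.save(thumb_path, "JPEG")

        # Write to a separate bucket so the function doesn't re-trigger itself.
        s3.upload_file(thumb_path, bucket + "-thumbnails", key + ".thumb.jpg")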

When I first heard about Lambda as a service I was quite against it, as I thought it seemed to be just a bit too much magic, and it seemed to be a service that would be hard to move away from. I mentioned my concern to someone at the summit, and their response was that “it’s just running an event driven node.js service, I could set that up myself without too much difficulty”. So it seems my fears of lock-in are a little overblown. Yes, it would take a bit of work to extract oneself from Lambda, but no, it wouldn’t be impossible, though it would likely lead to a cost increase.

Given what I saw in this presentation, and given some further thinking on lambda, I’d probably be quite willing to try it out now.

Elastic File System

The next session I went to was on a brand new service from Amazon, Elastic File System. This is basically a NAS that can be attached to a VPC. The presentation was OK. I came out wondering whether one could connect to this NAS from outside of an AWS VPC, i.e. from a local computer in the office. We have a use case for that at eLife, and I was unable to determine from the presentation whether we could do that. I think the thing here is to sign up for the preview service in order to find out more.

Deep dive in the aws-cli

The last presentation that I attended was a deep dive into the aws-cli. It was a great presentation, and the most hands-on of the day. The aws command line interface supports querying JSON output using the JMESPath syntax. I’d not heard of this before, but it looks amazing, and it comes with a terminal tool that can be used to explore JSON output, which can be piped into the tool. You can get the terminal tool via:

$ pip install jmespath-terminal

One of the other neat features of the aws-cli is that you can pass a JSON config file for fine-grained configuration of an AWS service, and the cli tool will also generate a skeleton JSON input file for a given service on request.
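
For example, a JMESPath --query can pull just the instance ids out of a describe call, and --generate-cli-skeleton prints an input template; the service chosen here is just illustrative:

$ aws ec2 describe-instances --query 'Reservations[].Instances[].InstanceId'
$ aws ec2 run-instances --generate-cli-skeleton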

Final thoughts

I came away from the summit wanting to explore and learn more about the following services:

  • Amazon Redshift
  • DynamoDB
  • Lambda
  • Elastic File Store
  • Amazon Machine Learning

RIP Terry Pratchett

I must have been fourteen, fifteen years old, I was passing through a train station in Northern Ireland and I picked up a copy of Mort, from that point on I was hooked. How could it not speak to me, a gangly awkward teenager, trying to find my way in the world. I devoured the books as they came out, within a few more years I had a healthy stack.

By that time I’d started going to scifi conventions and the great man was doing a signing at one around the corner from where I grew up. Diligently I brought my stack, stood in line and he looked up, saw my stack of books. I said “I hope you don’t mind”

“I don’t,” he said. “It’s the people behind you that you have to worry about. What’s the dedication?” “To Ian,” I said.

And so it began

“To Ian”
“To Ian, best wishes”
“To Ian, superior felicitations”
“To Ian, with the return of the best wishes from 20000 leagues from beneath the sea”

Years and years later, nighttime drawing in, I’d read those books to the person who is now my wife. Soon her knowledge of all things Discworld outstripped even mine; then came the audiobooks. It’s a surprise to me that our son escaped being called Horace.

Some thoughts about product management

I moved into digital product management in 2007. I had no formal training, and for much of the last eight years I’ve been learning on the job. There are a huge number of resources out there: great lectures, books, conferences, blog posts. In this short post I just want to reflect a bit on what I’ve learned on this topic through direct personal experience. I continue to learn, and my thinking continues to evolve, so this post is more of a look back than a look to the future. I’ve pinboarded a few of my favourite links under the tags product development and product management. I also highly recommend Marc Abraham’s blog; he writes frequently on things he learns and on his experience putting those things into practice.

So what is product management/development, can we define it?

When I first heard the term I thought that there might be a single thing that one did as a product manager. However, what one is dealing with at a fundamental level is an intrinsically complicated process involving many different moving parts, from people, to markets, to technology. It requires many different skills, with opportunities for contributions from many different quarters, and as a result cannot be reduced to a one-dimensional description.

I really like the 360-degree view of product management in Roman Pichler’s post about what product management is, and it’s the kind of view that I have now internalised, but it took me a few years to get there.

Anywhere people want to get things done, there is room for someone to look at the process with product development glasses on. You often hear this referred to as taking the voice of the user. In addition, I think we can break it down along another axis, one based on efficiency of effort. In recent years this has been codified under the lean heading: lean startup, lean UX, lean enterprise.

Are there any commonalities around what we do?

I think there are, I think that at any moment we need to:

  • figure out what the best thing to do is
  • do that thing 

We often fail at both of these through a combination of doing things that are just the next thing in front of us, doing things that seem like a good idea but have little evidence to support them, and then taking these sub-optimal ideas and failing to execute on them. This is not a situation to bemoan too much; it emerges as a natural consequence of human optimism and the desire to be productive, combined with operating in an environment with far from perfect information.

There are two pathologies that I want to call out as being especially harmful:

  • working on a project or idea, and then changing your mind later and throwing away all of that effort without getting to the point of delivery
  • putting in a lot of effort, and in the end building something that no one uses

The first of these sucks a lot more than the second one. At least with the second one you have a chance to learn something, and maybe get it better the next time. 

What can help navigate us through this fog?

I’ve just started reading Lean Enterprise, and the authors make an early analogy between the uncertainty we face in creating systems and products and the uncertainty of war. They talk about the fog of war and how to mitigate it. I think it’s a very powerful analogy, and so the question in front of us is: what activities or tools can help us navigate through this fog in the context of product development?

  • Create a sense of direction.
    You want to have confidence that you know the right way to be facing right now. That might change soon, and the right thing might be finding out more information, reducing uncertainty, or re-plotting your course, but you want to avoid rudderlessness, you want to avoid drifting. It’s great to be able to identify when you need more information and to be able to act on that. It’s OK to be in a state of uncertainty, but it’s less OK to be blind to your lack of information or direction. There are many tools to help with this, from vision statements, roadmapping, using a ticket triaging system, user research, business model generation, lean experimentation, MVP. They can be applied at different stages in the product and business lifecycle, but they all attempt to give you guidance on what you should be doing right now.

  • Find out what’s working.
    You can’t do all the things (though you should automate all the things!!), and every thing that you do creates a level of maintenance that you have to support, and it represents potential technical debt that could get in the way of the next thing that you want to do, so after you do something, try and figure out if it’s doing what you expected as well as you expected. Analytics, marketing, user testing, more user testing, revenue, buzz are all guides to this.

  • Feed back the stuff you have learnt into the next iteration.
    Learn and iterate: you know what you should be doing, you know how successful your last iteration was, so take what you learnt into your next iteration. One thing I’ve often been guilty of is releasing a thing, then leaving it and moving on to the next thing. Products mostly get better when they evolve, and systems mostly get more efficient and fit for purpose when they are given the freedom to change in the face of changing needs or new pain points (this often requires refactoring, and you should not consider refactoring to be an evil, but rather a consequence of working with software). Developer pain, user pain, 5% time, MVP, releasing not-fully-finished products early, show and tells, design crits, code sharing, code review, and BDD are some of the tools that can help here.

Of course I’ve just described in a very loose way a rough workflow based on agile and on the lean manifesto. All I can say here is that these are lessons that I have learnt through direct experience, mostly though not doing these things.

My greatest hits/mistakes

Here are some examples of product management koans that I’ve internalised as a result of direct experience. I might be wrong on some of these, and some of these might no longer represent as much of a risk as they once did, but this is where my current natural baseline of thinking sits right now, and so it’s useful to be able to describe them, so that they don’t remain sacred cows, but can be farmed, and perhaps taken out and replaced at some point by more useful lessons.

  • Focus on shipping product
    At Mendeley we had three core products, web, desktop and data services. The desktop client had a slow release cycle compared to the other two product domains. Coordinating features across all three components was a challenge, and often we would drop work mid flow to focus on a new idea. This led to waste, frustration, and sometimes disagreements about priorities. We recognised that getting to a situation where we could ship updates sooner would be critical to overcome this.

  • Make the work visible
    Sometimes work gets dropped because no one knows that it is happening. Making work visible through card boards, kanban, good retrospectives, or introspectives can be a real boon in stopping work from going to waste. At Mendeley I would often discover, only after we had decided to modify or drop a feature, that a lot more effort had been going on around it than I had realised, and when I made decisions to switch priorities I was annoying developers more than I knew.

  • Keep your developers happy
    Get them into a position where they can ship, remove technical debt where possible, keep scope clear and small, and provide them with as much contact with the end users as you can. Again, at Mendeley we sometimes had occasions where the engineering team was despairing at some of the internal issues they had to deal with, while at the same time our users were ecstatic with what we were doing. There was a gap there. The story that gets built up internally about a product can often be very different from the story that the end user has created about that product. Reconciling these world views is only ever a good thing.

  • Keep track of user pain
    A really great way to negotiate around what debt or bugs you deal with is to track user pain. It can be a good idea to roll developer pain into this too. Having a sense of what is the most painful aspect of the product, and continuously working to remove that gives you licence to negotiate leaving some of the less painful issues in place while you tackle the more painful ones. When we adopted this process at Mendeley it led to a significant increase in the overall quality of the product. We were able to focus piece by piece on the most important areas, rather than rushing to put out fires all over the place.

  • The human is harder than the technical
    Every time that I have worked on a multi-company product or project, the overhead of communication and negotiation has always been much larger than any of the technical aspects. Be prepared, and bring your best patience pills with you.

  • Test with users
    I’ve been guilty of shipping features without testing them, and I’ve been guilty of having grand plans and ideas that in the end didn’t make any difference to product adoption. The worst case was when working on Nature Network. We went away and created a business case around what we thought scientists might want in a product, then we locked ourselves in a room for several months writing detailed specs and wireframes, and we launched after only doing some focus groups. No iterations were tested with real users; the product was a turkey, and was killed a few years later.

  • don’t use InnoDB for heavy read write apps
    On Connotea we had a write-heavy and read-heavy app. The DB became a huge bottleneck, and the app slowly withered without the internal expertise or will to re-engineer it. This was not the only reason for the ultimate failure of that app, but it didn’t help.

  • don’t over-normalise your database, think about application views rather than pure data modelling
    Again, with Connotea the DB was perfectly normalised, but that led to excessively large queries when generating some of the most popular pages on the site. The data structure, when thinking about it as pure data, was perfect, but it didn’t match the daily use of the system. Start with the user and work backwards, not forwards from the data.

  • a product is not enough, it needs a supporting business model
    This was the real reason Connotea failed (in my opinion), and I hold my hand up in a big way on this. We had great adoption, a great product, and were early to market, but I was unable to make the case on how we could get to sustainability with Connotea, and so it never got the support that it needed. That was mainly due to my inexperience, and it remains a huge lesson to me through to today.

  • a business model is not enough, it has to scale 
    We had a business model at Mendeley, and it was starting to get to scale; however, there was certainly a lot of discussion around which approach to take: seeking continual individual conversions to a paid version vs going after the institutional or SaaS model. These two different business areas would require potentially different feature sets. We took on the task of pushing the product forward in all areas; we could potentially have made more progress with more focus. What business model will work is never a given, but if you have the opportunity to learn early, that can be a great boon to deciding product direction.

  • don’t leave your biggest opportunity on the table
    With Nature Network the obvious product feature would have been to create a profile for every Nature author. Boom, you suddenly have the most influential online profile representation of science. This was too difficult to achieve at the time that we created Nature Network, due to political and technical challenges, and we didn’t do it, but it still gnaws at me that we left that on the table.

  • don’t be afraid to work outside your comfort zone
    When we launched eLife we worked with a design agency with no previous background in working in the STM sector. They had a lot of experience on commercial web sites. We ended up with a site design and interaction elements that have been widely copied, and highly praised.

  • even with APIs, think of users over the technology stack
    At eLife, when we launched, we launched with an API built on top of Fluidinfo, because I saw great potential in the technology; however, the API interface was cumbersome and the service response fairly slow. We didn’t even build on top of it ourselves for internal purposes. We have dropped it and are starting again, building out from the service calls we need, as we need them. This was not the first time that I fell into this kind of trap: I had been an early advocate of Google Wave, an example of a great technology in search of a user problem. Start the other way, with the problems you want to solve for your customers, and bring the technology to those problems. I often suspect that the semantic web is a solution in search of a problem.

  • when you find really talented people, figure out how to get out of their way
    Also give them the space and freedom to do amazing things. eLife Lens is the result of such an approach, and remains one of the best products that I’ve been involved in, even though my involvement mainly consisted of getting out of the way.

Wittgenstein and Physics, Cross College Oxford one day seminar

(image: fjord, via flickr user peternijenhuis)

Last year I attended a one day seminar on Wittgenstein and Physics. It was held at Cross College Oxford and was the first in a planned series of talks on the history and philosophy of science. It’s been a long time since I’ve done physics seriously, and longer still since I took classes on the history and philosophy of science, so I attended very much in the mode of the interested outsider. As such my notes should be taken very much as a personal reflection on the talks, and I’m confident that I was missing quite a bit of underlying background to totally get everything that was being discussed. Nonetheless I thoroughly enjoyed it. In the notes below I’ve pulled out quotes as they happened, and I don’t intend to weave them together at all into a coherent narrative.

Rupert Read - University of East Anglia - How to admire science and dispose of scientism - scene setting

application of the scientific method outside its proper home is problematic

particularly problematic in the domain of philosophy

The alternative to scientism is that other disciplines should be seen to be something different. These other disciplines ought not to be assimilated by the discipline of science.

(There are a lot of “air quotes” going on, a visual counterpart to markdown, perhaps?).

(I have a lot of sympathy with most of the views outlined in this presentation, in regard to having a good, balanced, and fair understanding of the correct domains to which the scientific method should be applied; there is no doubt that science has led to great success. There is no basis to apply the scientific method directly to existential questions. These questions can be informed, but not resolved, by the information that is exposed to us through science. On the topic of mind and body, even this question is a specific and defined and outlined question. I believe that we will create consciousness of some bound or limit within my lifetime. I’ve just read about the embodiment of a worm consciousness in a robot through the replication of the connectome within the robot - this has to be considered a stunning result, but it remains, in my mind, tangential to the questions that I think are being raised by Wittgenstein’s concerns over the over-assimilation of the scientific method into other disciplines - that deserves more explanation on my part, and I’ll have no time to expand on that.)

(It concerns me to accept a phrase that states that philosophy is purely descriptive, while at the same time pushing to move philosophy away from science, as to me much of science is so much about being purely descriptive of the world, especially in the life sciences.)

today scientism is as strong as ever, arguably stronger

the movement of geeks, humanists, skeptics look like a move towards worshiping science (note: my comment here is a very paraphrased version of what Rupert said.)

the move to evidence for everything.

the idea that art is nothing more than entertainment

This comment about art has at its base that Wittgenstein could see how important art was, and we have perhaps lost that in modern times, relegating art to the domain of pure entertainment. (There is a robust discussion about this point in the Q&A session, and I had the thought that this line of argument could be drawn out by looking at how research councils are increasingly requiring justification through an impact agenda, through the request to see outputs, outreach, and impact criteria being met. I chatted very briefly to Rupert about this just before the lunch break. He made the good point that the impact agenda has the good aspect of urging the research community to connect more directly with the public, and I concur with that, but it’s interesting to me to see so many scientists seeking to make justification for what they would term blue-skies research. In addition, we know that there are distorting measures in place, such as the journal impact factor, and I feel that what we might need is a better language for the public discourse about the nature of the goods that the arts and sciences provide to society. In a way, not only are the arts suffering from an acceptance of scientism in how we think of impact, but so are the sciences.)

Susan Edwards-McKie - Wittgenstein’s solution to Einstein’s problem: Calibration across systems

This is a deep piece of scholarship on the reconstruction of a manuscript version of the Philosophical Investigations. The scholarship stands on its own, but the question that I’m interested in is: how does this change our understanding? I believe the point being made is that this hidden revision provides access to more of Wittgenstein’s thoughts on mathematics, physics, quantum mechanics, and causality than if you take the existing published work on its own.

It is also interesting to me that so much scholarship can go into understanding the minutiae of a manuscript. What is the future outlook going to be for digital scholarship and reconstruction of thought in a world of almost infinite versions (every keystroke of the document that I am writing is getting backed up and versioned).

I'm unfamiliar with Wittgenstein's criticism of trans-finite mathematics. (I honestly don't know what I think about trans-finite mathematics, but I do know that my own biases and intuitions are frequently confounded both by nature and by my own understanding of nature.) (Since the seminar I've found a good introductory article, and having read through it, I'm not 100% convinced by Wittgenstein's criticism, but I feel I can understand it.)

Again this question of whether a machine can think comes up. Can a machine think? For a certain definition of thinking the answer is yes.

The idea of infinity as a property of space

space gives to reality an unending opportunity for division

Wittgenstein gives to space the property of infinite divisibility.

She mentions a 1929 article - "Some Remarks on Logical Form". I think it's mentioned that this article offers an argument for the articulation of the infinite without an appeal to either infinitely large or infinitely small numbers - and through this provides a critique of Cantor.
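(Having since read around Cantor a little, a one-line sketch of the diagonal argument - my own reconstruction of the standard textbook version, not anything presented in the talk - helps show what the target of such a critique looks like.)

```latex
% Cantor's diagonal argument, the canonical piece of trans-finite reasoning.
% Given any purported enumeration s_1, s_2, ... of infinite binary sequences,
% flip the diagonal to build a sequence d that the enumeration misses:
\[
  d_n = 1 - s_{n,n} \quad\Rightarrow\quad \forall n:\ d \neq s_n ,
\]
% so the set of such sequences is strictly bigger than the naturals:
\[
  \left| \{0,1\}^{\mathbb{N}} \right| > \left| \mathbb{N} \right| = \aleph_0 .
\]
```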

There is an in-depth conversation on the fragment that contains some diagrams of circles and squares. These are very rough sketches representing a system of thought from Wittgenstein, but it is difficult for me to appreciate the diagram, or the claims made about it. When thinking about diagrammatic methods, I have long been a fan of some of the work by Penrose, and also of some of the graph-theory decompositions, but those come with more infrastructure around them.

If we talk about heaps, and paths through them, and diagrammatic methods, then I find the work of Feynman with his diagrammatic method for QED calculations to be the benchmark here.

Wittgenstein's system is never point based, it is interval based. In a system like this one needs a different mechanism for coordination.

There is a mention of a universal oscillating machine, and it's mentioned that this idea has been developed in late 20th century cosmology. I would like to get more references on this; as a naive participant in the meeting, and an ex-cosmologist, it's not immediately clear to me what cosmological theory is being referred to here.

special relativity is a machine with a kinematic basis, general relativity is more of a geometric machine

(I think that is a spot-on observation; however, the geometry-of-physics approach of building up structure through the addition of geometric machinery allows kinematics, electrodynamics and all of the known conservation laws to just drop out of the machinery, so perhaps the fundamental distinction between these descriptions is not so deep.)
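(To make that contrast concrete - my own shorthand, standard textbook notation rather than anything from the talk - special relativity fixes a rigid kinematic arena through an invariant interval, while general relativity makes the geometry itself dynamical.)

```latex
% Special relativity: a fixed kinematic stage, the Minkowski interval
\[
  ds^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2
\]
% General relativity: the metric becomes dynamical, sourced by matter
% through Einstein's field equations
\[
  G_{\mu\nu} = \frac{8\pi G}{c^4}\, T_{\mu\nu}
\]
```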

There is a discussion in the Q&A on whether Wittgenstein really made claims that give insights into quantum entanglement - which seems to be the claim of this talk. There is some pushback from a person in the audience, and some discussion of how intervals, metrics and points are described and determined. I would concur with the person asking the question; my own understanding of metrics comes from courses on geometry and general relativity.

The speaker says that the hidden revision - with some fragments from Wittgenstein - forms the solid basis for Wittgenstein's contributions to ideas about entanglement, but there is a case to be made that our current understanding of actual entanglement has evolved from a very concrete basis.

Overall this speaker failed to convince on the topic. It is hard for me to accept that an unpublished and minor fragment of a document on its own can be revelatory to the extent that it could support a new view on a topic - quantum entanglement - that now has a very solid basis in both theory and practice. For me it would need to provide a way to get to a critical experiment, or, to put it another way, a way to produce a proposition that could be compared with truth states in the world around us, and it seems that there is not enough presented here to make that step.

## Carlo Penco - Wittgenstein's Mental Experiments and Relativity Theory

This is a great talk, you should go and watch it.

Einstein’s work can be seen as a work on the tools with which we describe and compare events - that is a work on the behaviour of clocks and rods in different coordinate systems

Wittgenstein’s work can be seen as a work on the tools with which we describe the world - that is a work on the behaviour of concepts in different conceptual frameworks

these tools must be … coordinated and rigid

Can we push this analogy any further? Alien language games are games where our intuitive rules break down, just as Einstein's thought experiments pushed behaviour towards the speed of light to break our concepts of clocks and rulers.

Einstein found the right invariants to make this work. Where can we find these kinds of invariants in Wittgenstein’s work?
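(The invariant Einstein found is the spacetime interval: observers disagree about times and lengths, but not about the interval. Again this is my own illustration, standard textbook material rather than something from the talk.)

```latex
% A Lorentz boost with velocity v, writing \beta = v/c and
% \gamma = 1/\sqrt{1-\beta^2}:
\[
  t' = \gamma \left( t - \frac{v x}{c^2} \right), \qquad
  x' = \gamma \left( x - v t \right),
\]
% leaves the interval unchanged - every inertial observer computes
% the same value:
\[
  c^2 t'^2 - x'^2 = c^2 t^2 - x^2 .
\]
```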

There are different kinds of alien games

  • different facts of nature
    • 2+2 sometimes gave 5, sometimes 7
  • different behaviours
    • people gave different expressions of pain
    • tribe sells wood by area of the pile, not the amount

There are relativists and anti-relativists in the context of Wittgenstein. How can there be such strongly divergent interpretations? (One might say this is natural when we leave the interpretation to academics, let alone philosophers.)

Where are the invariants? We assume that aliens follow rules (Quine is mentioned).

If our concepts seem not to work properly then it probably means that our translation manual is wrong.

Transformations must explain the rule-following behaviour; the invariant is that the aliens do follow rules.

This really is a lovely talk. The example given in the talk is of a native culture that uses a particular method for assigning a value to a pile of wood - they measure how far the wood extends along the ground, rather than the total amount of wood. This is seen as backwards and odd behaviour by a colonist, who goes on to "teach" the natives the right method to price a pile of wood. What I loved about this example was one of the first questions from the floor. Someone suggested that had a native discovered that large piles of wood could be laid out differently, and arbitraged the pricing, then the other natives would have seen that this one person was getting rich, and they would naturally conclude that their system of valuing the wood was broken - indicating that there was indeed a pathology to the way they did things. Carlo answered that what we really don't know is what the term "value" means to the natives. Is there a ceremony associated with the way the wood is laid out? Is there a moment in the calendar when the natives want to get rid of their money? Are there social contracts going on that we are not aware of? The questioner fell directly into the trap that Wittgenstein says we must avoid: by using a faulty translation, without understanding the entire system of use of the language as deployed by the natives, he came to the conclusion that there really is something wrong with the way these people price wood, whereas what we perhaps ought to take away is that we probably just don't understand enough yet of what is really going on.

## Introductory remarks to the afternoon session

It’s mentioned that Dirac and Dyson both met Wittgenstein and had little time for him, but Bohr disagreed with them deeply on this.

## Dr Chon Tejedor - Wittgenstein, induction and the principles of the natural sciences.

She has a book from Routledge on this topic.

She starts by discussing a view of causality described as the natural necessity view.

There is a distinction between natural and logical entailment.

(I might say that natural entailment appeals to a law of nature to explain the relation between p and q, and as such provides a mechanism that might be said to be hidden from p and q; in contrast, with a logical entailment the propositions are structurally related to each other, and the relation emerges out of the internal structure of the propositions.)
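(Schematically - my notation, not the speaker's - a logical entailment holds in virtue of propositional structure alone, while a natural entailment needs a law of nature smuggled in as an extra premise.)

```latex
% Logical entailment: the conclusion is contained in the internal
% structure of the premise
\[
  p \wedge q \;\vdash\; p
\]
% Natural entailment: p alone does not yield q; a law of nature L
% must be added as a premise
\[
  p \nvdash q , \qquad p \wedge L \;\vdash\; q
\]
```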

There is an interesting discussion about how the applications of the rules of logic are independent of the facts of the world, as these rules are applied solely based on the internal structure of the propositions.

(This is where a naive reading of these topics can tie one into knots. I want to look at the boundaries and interfaces between what is considered reality and the logic operating within it; I want to imagine that, for the example given of a natural law (magnetism), the law is tied directly to the internal states of the objects forming the propositions (the pin, the magnet). But I have to leave that behind, and accept, in the context of this talk, that I should take the topic away with me and use it as a guide for whenever I return to thinking about Wittgenstein's ideas in more detail.)

One of the things at the heart of the topic in this talk is our common everyday usage of logic in navigating the world, but I’m not at all sure how we do that.

The conclusion we get to is that we need a non-NN (non-natural-necessity) understanding of scientific laws to have a credible view of induction and causation.

There is a positive view on scientific laws in the Tractatus, and we turn to that view now.

The purpose of a law is not to justify or ground causation; rather it acts as an instruction within a system for the generation of propositions. (One might say that propositions, as described by Popper, should be used to hit the hard bounds of that system, to see if one can break it.)

it is the sign of a law being at work that our construction of propositions is constrained

the so-called law of induction constrains nothing: it is not a law.

(I wonder: would one say that a law of some systems of mathematics is that they are systems in which induction applies?)
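(For concreteness, the induction schema I have in mind is the usual one from Peano arithmetic - my gloss, not the speaker's.)

```latex
% The induction schema: a system "in which induction applies" is one
% where, for each predicate \varphi, the following holds
\[
  \bigl[\, \varphi(0) \wedge \forall n \,(\varphi(n) \rightarrow \varphi(n+1)) \,\bigr]
  \;\rightarrow\; \forall n\, \varphi(n)
\]
```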

## Dr Richard Staley - Boltzmann, Mach and Wittgenstein's Vienna.

This is a good talk, I'm enjoying this. I didn't know that Wittgenstein had considered studying with Boltzmann. It was a call from Boltzmann for a genius to help with the development of aeronautics that prompted Wittgenstein to take up aeronautical engineering.

There was a lot of detail about the thought of Mach, much more than I’d been exposed to, and it was a pleasure to listen to. I don’t have any strong takeaways from this talk, but when I get a link to the video I’ll post it, and I highly recommend a viewing.

# Climbing outlook for 2015


Last year I hit on the best way to set climbing goals (this was after many years of failing to hit my goals). The best way to do it was to set very short term goals - not goals for the end of the year, but quarter by quarter - and to modify and update those goals as I went. In addition, I found that focussing not on outcomes, but on process, also helped a lot. (This is a bit like the "Objectives and Key Results" method of goal setting used at Google.)

I was going really well at the beginning of 2014, and I was keeping my climbing goals updated on my blog; performance was improving. Then, around Easter time, I hit a big setback: I started to get extreme elbow tendonitis :/

I was forced to take another long break from climbing, and I just sat around watching my elbow get worse and worse. At one point cutting bread was painful, and it got to the point where I avoided picking up my two-year-old son, as that was also very painful.

Well, long story short, after five months of that I finally made the changes in my life that would enable me to heal my elbow (that started back in September). I sought out a consultation with the climbing doctor – Jared Vagy (probably the best money I've ever spent on climbing), did a root cause analysis of the problem, and rearranged my life a bit to fix it.

The tendonitis had been caused by a new daily routine of pushing my son to nursery every morning. I had been using a fold-up bike to get to work after dropping him off, so I had to push the buggy with one arm, and that caused the excess strain in my elbow. The solution was to start putting our son into a bike seat, which allowed me to bike him to nursery rather than push him. I also found the right form of exercise to deal with golfer's elbow.

Now I'm on a climbing trip in Spain (as I draft this post) and I am actually able to climb again.

My form is a bit rusty, but a good focus on training for this trip between December and January has paid off, and I'm able to get up some routes. It's been nearly two years since I was last out doing routes. The location is amazing, the people I'm climbing with are a lot of fun, and it's great to just get outside and be able to focus on some delicate moves, listening to bits of my body complaining, looking out over the valley.

By the end of day one of a short trip (two and a bit more days to go), my key goal for the rest of the trip was to get up a few routes day on day, to focus on movement, to get to a point where I was moving freely on the rock again, and to see if I could expand my reservoir of endurance a little. I had many open questions in my mind about how hard I might be able to climb. I was hoping to get up some routes at the top end of my ability, but was also aware that I might get totally shut down, either mentally or physically, since it had been so long since I was out. I was also aware that my elbow might give out.

In the end the days were short, with no more than four routes done each day. But I hit a personal high point: I've only ever climbed one route harder, and that was 11 years ago.

Five weeks ago in the gym I was training with one of the other people on this trip, and he said he would not have given me any chance of getting up the route that I did. The main things that got me up it were the time I had put into fingerboarding over the past three months, and getting comfortable moving on the rock again. I was able to commit to the moves, just happy to be climbing at all. Hitting that high point was personally very satisfying, and it was a great trip. My psyche is very high again.

I also came away mostly unscathed. By the last day one person had a badly injured shoulder, one had what looks like a tear in a knee ligament, and one found the conditions too cold to be able to climb on the last day. I’m extremely fortunate to come away with some really good ticks and only some tired arms and slightly scraped hands as collateral damage. As I, and those around me, get that bit older, preservation is becoming as important as improvement.

I emailed Jared to thank him for his input, and he replied

You made the changes yourself by committing to fixing your weaknesses and being aware of your body, I just gave you the knowledge and tools to do so.

# The 70/90 Rule

I was having a conversation last week with a good friend who works in the financial services sector. We were discussing technical debt, and the tendency of certain teams to want to do everything themselves. He described some colleagues in a team tasked with managing a large data pipeline for calculating risks in a particular market - work that is central to the bottom line of the company. Oh, and they also used to run their own mailing list server! This was in an organisation that had its own very functional and well-behaved central mail server setup. The reason they gave for wanting to run a separate mailing list just for that team was that they wanted the ability to search the archive, but the real reason is that they had fallen into the trap of wanting to roll their own thing. (It turned out that the search functionality had broken back in 2012, but no one had noticed.)

When recounting this to me, my friend remarked

I’d much prefer to use an off the shelf solution that did 70% of the job, than custom build my own system to be able to do 90% of the job.

There are two really smart observations here. The first is that we never get to 100% - we can always think of things to improve - so rather than going for 100% it's a better strategy to identify which parts of an imagined system are going to create real value.

The second is that we should be mindful of where our effort is going. That thing we are customising and configuring - the workflow, or mailing list, or phone setup, or multi-core cluster - how much of that effort is helping us get the actual job done? How much of it is satisfying our desire to tinker with the solution, rather than engage with the job?

For sure there will be tasks that we need to go all-in on, however we should be really careful to make sure that these are the ones where we are adding the most value.

I got to thinking about my blog. Since the start of last year I'd not been happy with the design: it was not responsive, the typography was messy, and the font was a bit too small for my tastes.

I started to tinker.

I took inspiration from Martin Fenner's blog, and all of a sudden I was looking at getting into a world of custom CSS and Jekyll extensions.

Then, after the conversation with my friend, I stopped and asked myself: is there an off the shelf solution that gets me to 70% of my needs, and that requires absolutely minimal configuration?

I did a quick google for "Jekyll themes" and within an hour had adopted a version of the Hyde theme.

The original customisation task had ballooned so much that I'd created a Trello board to track all of the things I wanted to do. After opting for the off the shelf solution I was able to drop most of those requirements and get back to focusing on writing for the blog, rather than worrying about configuring it.