Timing and syncing up with the OpenStreetMap community

I’ve started the process of getting the OSM UK community involved. I’ve chatted to lots of people about it, and then I posted an RFC on the talk-gb mailing list. Plenty of useful comments already.

People can join in voluntarily whenever they want. But there’s also the opportunity of having solar as the next “OSM UK Quarterly Project”, if the community collectively votes to do so. (They do this via a Loomio discussion group and vote.) The next one would be Jul-Sep (or the one after that Oct-Dec).

So this raises a timing question: is it likely we could get at least a first draft of a MapRoulette (or other) task set up with “solar panel locations in the UK suggested by DeepSolarOrSuchLike”, online by e.g. the end of May or mid-June?

If so, the community could well agree to use it for the quarterly. (If not, there’s an alternative arrangement: use the quarterly project to get lots of mapping done ad-hoc by the community, then use that as improved training data for ML, after which MapRoulette can run later.)

Thanks Dan! Great to see folks engaged with the idea already. Tyler might be able to have his USA DeepSolar suggestions done by then. But given the OSM community is global, it would be good to make sure the project is something everyone can easily contribute to.
Having large, high-quality training datasets for everywhere in the world is absolutely crucial. What if the project was just people mapping their chosen/local areas using existing methods? Alongside this, it sounds like there are some inconsistencies with the tagging that could be discussed and improved.

If, at the end of the project, there are tons of additional data points in many cities across the world, it would then let anyone pick up DeepSolar or similar and try their hand at mapping larger areas. It would probably mean a more ad-hoc project to help verify those suggestions and add them to OSM.

I’m just aware that if we just proposed mapping the USA DeepSolar suggestions, it might not be so interesting for members outside the USA? Or those wanting to do ground observation mapping?

The OSM community is global but it’s not homogeneous. For most countries there are local “chapters”, and here I’m working with the UK chapter.

The UK “quarterly project” will only be editing UK, and there’s no way to change that. There’s a real energy there to get this job done well for the UK. We can make the most of that enthusiasm, but those same users are not necessarily going to then dedicate their time to the global scale.

The video-call discussion was mostly very UK-focussed, and my impression is that this is a good thing, at least in that Jack and others have ideas for how to take the data and get it actually used at UK National Grid or wherever. In the call I raised the question of going global, and we do have this in mind, perhaps as a separate “prong” of work.

What if the project was just people mapping their chosen/local areas using existing methods?

That’s absolutely fine and good. Indeed the UK mappers will have a go at this with the current tools+knowledge, even if the ML predictions aren’t there to help finding the needles in haystacks. So for OSM UK, it’s not a roadblock if there are no ML predictions.

If you’re wanting to know how to do that worldwide, the question is how to make that happen. In OSM we’ve a LOT of experience of lots of different mapping tasks, e.g. the humanitarian mapping - it takes an entire community of dedicated volunteers, with a very strong humanitarian motive, years to build up the community and momentum for it. (We have so far mapped only a tiny fraction of the buildings in Africa.) At global scale, we will not be able to do this without some really good acceleration from the ML predictions.

Ah! I had missed that it is the UK community list 🙂

Let’s definitely focus on just manually mapping new points and improving the consistency of existing ones. Getting access to satellite data in the UK that is good enough for DeepSolar, and that can also be used licence-wise in OSM, is going to take some time (maybe the rest of this year…?).

But if the manually mapped data in the UK is expanded it will help a lot with the DeepSolar work (by allowing us to build training datasets).

@laurencew just posted a link to https://wiki.openmod-initiative.org/wiki/Power_plant_portfolios and I found this map of UK solar farms >1MW. Might be worth using this as a reference for OSM mapping?

Jamie (Sheffield Solar) will also be sending us a map of UK installations by postcode that’s more accurate than the FiT database. This will be a good guide for OSM mappers.


Do you have this map / dataset at postcode level from Jamie? Are you able to share?

We have a list of ~all PV sites that we simulate for our work with NGESO. The “site list” is derived by combining data from OfGEM (FIT database), BEIS (REPD database) and a commercial market analyst firm. It’s not completely exhaustive and there are issues cross referencing the three sources accurately, but it covers much more of the deployed capacity than the FIT database alone.

The locations available from each database are limited though - FIT only provides outward postcode, REPD provides a mixture (sometimes full postcode, sometimes geocoded from outward) and the commercial dataset gives precise locations (I think). I can’t share the raw data because the commercial dataset is provided under license. I could provide a PV system count by outward postcode, but that might not be sufficient spatial resolution? Alternatively I could aggregate to grid squares using the centroids of the postcodes/outward-postcodes? Or I could just aggregate by postcode where possible and outward postcode where not (using some kind of hierarchical data structure)?
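For anyone wondering what the "hierarchical" option might look like, here is a minimal sketch, not the actual Sheffield Solar pipeline. It assumes each record is either a full postcode or just an outward postcode, and relies on the fact that the inward part of a UK postcode is always the last three characters. The function names and the example records are made up for illustration.

```python
from collections import defaultdict


def outward_code(postcode: str) -> str:
    """Return the outward part (postcode district) of a full UK postcode.

    The inward part is always the last three characters,
    so "S10 2TN" gives "S10".
    """
    compact = postcode.replace(" ", "").upper()
    return compact[:-3]


def aggregate(records):
    """Count systems per district, split by full postcode where known.

    `records` is an iterable of (code, is_full_postcode) pairs
    (a hypothetical input format, chosen for this sketch).
    """
    tree = defaultdict(
        lambda: {"total": 0, "by_postcode": defaultdict(int), "district_only": 0}
    )
    for code, is_full in records:
        compact = code.replace(" ", "").upper()
        if is_full:
            district = outward_code(compact)
            tree[district]["by_postcode"][compact] += 1
        else:
            # Only the district is known, so count it at that level.
            district = compact
            tree[district]["district_only"] += 1
        tree[district]["total"] += 1
    return tree


# Made-up example records, not real site data.
records = [("S10 2TN", True), ("S10 2TN", True), ("S10", False), ("BS8 1TH", True)]
tree = aggregate(records)
```

Every district keeps a reliable total, and the finer full-postcode counts are there for whichever records support them.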

Hi Jamie - do you know whether the “precise locations” are actually created by geocoding of address data? That seems fairly likely to me, and affects how accurate we think they are.

BTW, the FiT data does have higher precision than outward postcode (for others reading along: note that “outward postcode” is the same as “postcode district” which is the term I’ve been using). The LSOA column is more precise, and you could get LSOA centroids from here.

I’d be really interested to know: judging from your list of ~all PV sites, what is the total number of them?

For the purpose of syncing up with crowdsourcing efforts, I’d say that the most preferable would be if you could:

  • Provide a PV system count by outward postcode, and
  • Provide a PV system count by LSOA (this is more precise, but I think LSOAs only exist for England+Wales)

I definitely appreciate that it’d be lovely to have full-postcode (v high granularity) breakdowns to the extent you can do them, but I can’t think of a good way of handling data that has a mixture of postcode-districts and full-postcodes.

In my humble opinion grid-square data isn’t going to be particularly useful to guide crowdsourcing, though it might be quite compatible with machine-vision work. Grid-square data might also be a bit harder to handle properly for privacy, since there could be plenty of squares with very few entries in them.
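One common way to deal with the "very few entries" problem is to withhold any count below some threshold before publishing. A minimal sketch follows; the threshold of 5 is purely an illustrative assumption, not something agreed in this thread, and the area codes and counts are made up.

```python
def suppress_small_counts(counts, threshold=5):
    """Replace counts below `threshold` with None so they are not published.

    The default threshold is an assumption for illustration only.
    """
    return {area: (n if n >= threshold else None) for area, n in counts.items()}


# Made-up counts per area.
counts = {"S10": 42, "S11": 3, "BS8": 5}
published = suppress_small_counts(counts)
```

The same function works whether the areas are postcode districts, LSOAs or grid squares; smaller areas will simply end up with more suppressed entries.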

Anyone else? Happy to hear a contrary view to mine.

Happy to do this. I can’t recall if the REPD and commercial dataset include the (full) postcode, so may need to reverse geocode, but shouldn’t be a problem.

I hadn’t considered that LSOAs are generally much smaller than postcode districts - good shout! I don’t think REPD and the commercial dataset give the LSOA, but I would need to check. Either way I could do a spatial join using the boundary shapefiles from ONS.
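For readers unfamiliar with the term, a spatial join here just means working out which boundary polygon each site falls inside. In practice this would be done with a GIS library reading the ONS boundary files; the sketch below is a standard-library-only stand-in using a ray-casting point-in-polygon test. The square "areas", the site coordinates and all names are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`.

    `polygon` is a list of (x, y) vertices in order.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(sites, areas):
    """Map each site id to the first area containing it, or None."""
    result = {}
    for site_id, (x, y) in sites.items():
        result[site_id] = next(
            (code for code, poly in areas.items() if point_in_polygon(x, y, poly)),
            None,
        )
    return result


# Two made-up square areas side by side, and three made-up sites.
areas = {
    "AREA_A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "AREA_B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
sites = {"site1": (2, 3), "site2": (15, 5), "site3": (30, 30)}
joined = spatial_join(sites, areas)
```

Counting the values of `joined` then gives the per-area system counts; sites that fall outside every boundary come back as None and would need checking by hand.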

I’m super busy atm so will probably be at least a week or so before I can compile the datasets. In the meantime if anyone else has any suggestions please feel free to chip in…
