SolarPanelDataWrangler

Hey y’all

I figured I’d go ahead and create a thread for my ongoing solar panel locating project hosted at https://github.com/typicalTYLER/SolarPanelDataWrangler (The name is up for debate :) maybe just SolarPanelFinder?)

I’m currently using it to find solar panels in Austin, Texas (which has good Mapbox satellite imagery). That’s about 5 million image tiles overall, and classification speed has slowed to 15 tiles/s, which comes out to about 4 days of run time in total. (I have an i5 processor and a GTX 1080.)
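For anyone who wants to check that estimate or plug in their own numbers, here is a minimal sketch of the arithmetic. The function name is made up for illustration and is not part of the project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimated_runtime_days(total_tiles: int, tiles_per_second: float) -> float:
    """Return the expected classification time in days at a constant rate."""
    if tiles_per_second <= 0:
        raise ValueError("tiles_per_second must be positive")
    return total_tiles / tiles_per_second / SECONDS_PER_DAY


# Figures from the post: ~5 million tiles at ~15 tiles/s.
days = estimated_runtime_days(5_000_000, 15)
print(f"{days:.2f} days")  # 3.86 days
```

This assumes the rate stays constant, which it may not if the speed keeps dropping.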

The code needs lots of optimization, which I’ve held off on as I’m 86% done with the prototype city. Once this job finishes I can try to set up a human verification task for the positives (some tests are currently underway with this) and try to optimize further without fear of messing up the ongoing task.

In addition to optimization I’m hoping to dockerize the project, add some tests and documentation, as well as some other things tracked in the issues. Hopefully these steps will make the project more contributor friendly, and allow more people to employ machine learning to locate solar panels for various cities around the world.

I’m definitely accepting contributions/suggestions if anybody is looking to get involved!

We work with a local social housing authority here in Sheffield, UK who have deployed a few hundred systems on their housing stock. We know the exact locations, orientations, tilts, number of panels and capacities, so perhaps this would make for a good test set? The systems are in clusters in various suburbs around Sheffield, so there should be fewer images to process. Obviously there will be some considerations in that Sheffield looks quite different to Austin from above, but I expect the panels and roofs will look much the same.

Let me know if you’re interested and I’ll ask them for permission to share the system data…

I know @dct is working on building a UK dataset to apply transfer learning to the DeepSolar model (which I use in this project, and which was trained on US data), so the data you mention would definitely be helpful to him. This project is currently more geared toward managing the mass application of a classification (and possibly segmentation) model to a source of imagery (currently only Mapbox) for a given location, rather than testing and retraining of a model.

In the future these projects might merge, but for now they’re separate.
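To give a rough idea of what "mass application for a given location" involves, here is a small sketch that counts the web map tiles covering a bounding box. It uses the standard Web Mercator (slippy map) tile formula. It is not the project's actual code, and the function names, zoom level and coordinates are assumptions for illustration only.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Convert a WGS84 longitude/latitude to slippy-map tile x/y indices."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def count_tiles(west: float, south: float, east: float, north: float, zoom: int) -> int:
    """Count the tiles needed to cover a bounding box at the given zoom."""
    # Tile y indices grow southward, so the north-west corner gives the minimums.
    x_min, y_min = lonlat_to_tile(west, north, zoom)
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)


# Placeholder box around the origin, not a real city boundary.
print(count_tiles(-10.0, -10.0, 10.0, 10.0, zoom=1))  # 4
```

The tile count grows roughly fourfold with each extra zoom level, which is why a single city can run into millions of tiles.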