I’m coming over from @dct’s post on the fastai forum geospatial thread. Thanks for posting your info there and the OCF initiative in general.
I took a look at the SolarMapper paper and code - while the repo @TylerBusby333 got directed to was hard to follow (it looks like the lab’s general tooling for image segmentation on sat imagery, not specifically applied to the SolarMapper project), the paper did lead me to this great open dataset of labeled solar PVs across 4 cities in California, on 30cm USGS aerial imagery, which is what SolarMapper is trained/validated on:
The individual geotiffs and label files as presented on figshare are pretty clunky to work with. I saw that the dataset is openly CC0 licensed, so I merged and converted the 4 cities’ imagery into Cloud-Optimized GeoTIFFs, clipped the geojson labels to each city, and put everything up as an experimental labeled SpatioTemporal Asset Catalog (STAC). Note that I JPEG-compressed the COGs to get ~15x filesize reductions. There are some visible artifacts, particularly on the smaller rooftop PVs, but given how much less storage and bandwidth JPEG compression needs, I’m interested to see how far we can push a model to detect panels regardless.
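For anyone who wants to do something similar, here’s a minimal sketch of the kind of conversion involved - the file names, label layout, and profile settings are illustrative assumptions, not my exact pipeline:

```python
# Sketch: convert one city's merged GeoTIFF to a JPEG-compressed COG and clip
# the labels to that city's raster footprint. Paths/settings are illustrative.
import geopandas as gpd
import rasterio
from shapely.geometry import box
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

src_tif = "fresno_merged.tif"            # hypothetical merged mosaic for one city
dst_cog = "fresno_cog_jpeg.tif"

# JPEG-compressed COG profile (lossy, but much smaller files)
cog_translate(src_tif, dst_cog, cog_profiles.get("jpeg"))

# Clip the combined label geojson to the city's extent
labels = gpd.read_file("solar_pv_labels.geojson")    # hypothetical combined labels
with rasterio.open(src_tif) as src:
    footprint = gpd.GeoDataFrame(geometry=[box(*src.bounds)], crs=src.crs.to_wkt())
city_labels = gpd.clip(labels.to_crs(footprint.crs), footprint)
city_labels.to_file("fresno_labels.geojson", driver="GeoJSON")
```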
Below is a screencap of a small stac-browser instance I deployed to visualize the collection of imagery and labels per city:
The Fresno assets preview page is a bit slow to load at the moment - not because of the 2.3GB size of the geotiff (thanks to COG!) but because the 7.5MB geojson of labels has to load first. Will fix this later…
In early experiments with a 5k sample of the data, 20 epochs, and minimal hyperparameter tweaking to train a segmentation model, I was getting 0.6 IoU on my random 20% validation set (vs the paper’s reported 0.67 IoU in aggregate on 2-fold cross validation across 3 cities - not sure why they didn’t use Oxnard as well).
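In case it’s useful for others following along, the training setup is roughly this shape - a minimal sketch assuming fastai v2, with tile paths, mask naming, and batch size as placeholder assumptions:

```python
# Sketch of a fastai U-Net segmentation setup (paths, naming, and
# hyperparameters are assumptions, not the exact pipeline).
from fastai.vision.all import *

path = Path("tiles")                      # hypothetical: image tiles + mask tiles
codes = ["background", "solar_pv"]

def label_func(fn):
    # hypothetical convention: masks alongside images with a _mask suffix
    return path / "masks" / f"{fn.stem}_mask.png"

dls = SegmentationDataLoaders.from_label_func(
    path, get_image_files(path / "images"), label_func,
    codes=codes, valid_pct=0.2, bs=8)

learn = unet_learner(dls, resnet34, metrics=[JaccardCoeff()])  # IoU-style metric
learn.fine_tune(20)                       # ~20 epochs, minimal tweaking
```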
Created 25k tiles and am in the process of training a segmentation model on the full dataset now. Will update once it’s trained and I can do some model performance evaluation on the object/instance level and try inference on new areas.
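For reference, cutting the COGs into fixed-size training tiles is straightforward with rasterio windowed reads - a rough sketch (tile size and output naming are assumptions):

```python
# Sketch: cut a COG into fixed-size image tiles with rasterio windowed reads.
import rasterio
from rasterio.windows import Window

TILE = 256
with rasterio.open("fresno_cog_jpeg.tif") as src:
    meta = src.meta.copy()
    for row in range(0, src.height, TILE):
        for col in range(0, src.width, TILE):
            window = Window(col, row, TILE, TILE)
            data = src.read(window=window, boundless=True, fill_value=0)
            meta.update(width=TILE, height=TILE, driver="GTiff",
                        transform=src.window_transform(window))
            with rasterio.open(f"tiles/images/{row}_{col}.tif", "w", **meta) as dst:
                dst.write(data)
```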
I expect it won’t perform very well off the bat on new imagery/geographies, given the homogeneity of this imagery and these locations, and given the SolarMapper team’s experience applying their model to the state of Connecticut. This is why I’m deliberately using a smaller resnet34-encoder model that can finetune quickly on new data, and why I’m interested in exploring semi-/weakly-supervised approaches that can make use of roughly correct but not pixel-perfect OSM data - for example, mappers often label small rooftop PVs as points instead of drawing the polygons.
Thanks so much for sharing your progress.
TylerBusby333’s repo is mostly focused on extracting MapBox imagery for the whole of the US. But as you’ve seen, there are some other great imagery sources.
You highlight one of the biggest issues, which is differences in rooftop and other features between locations. I’ve been thinking that it might be worth making a dataset which includes imagery from several cities. The challenge would then be building a model which works well across these diverse locations.
For example, someone mentioned there is 6" resolution imagery for NYC available: https://maps.nyc.gov/tiles/
We could try and find some European cities as well.
Thanks! I hadn’t seen the NYC tiles before - looks promising, particularly that it’s captured every 2 years. Could do some interesting change detection with that.
I’ve finished a training run on more data from Fresno and Stockton (validating on Oxnard and Modesto), and was able to hit 0.68 pixel IoU on the Modesto validation data (compared to 0.66 reported in the paper). It’s not quite an apples-to-apples comparison because my validation set is made differently and I balanced the data between positive and negative tiles. When I ran inference on all of Modesto to eyeball performance at the object/instance level, recall looked pretty good (it correctly segments almost every ground-truth panel) while precision was pretty low (lots of FPs), likely because I didn’t have enough negative examples in my training data. That’s easy enough to fix by adding more negative tiles to train on.
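The positive/negative balancing can be as simple as checking whether a tile’s mask contains any PV pixels and then sampling negatives at some ratio - a sketch with an assumed file layout and ratio:

```python
# Sketch: split tiles into positive (mask has PV pixels) vs negative tiles,
# then sample negatives at a chosen ratio. Layout and ratio are assumptions.
import random
from pathlib import Path
import numpy as np
from PIL import Image

positives, negatives = [], []
for mask_path in Path("tiles/masks").glob("*.png"):
    if np.array(Image.open(mask_path)).any():
        positives.append(mask_path.stem)
    else:
        negatives.append(mask_path.stem)

# e.g. ~1 negative per positive; raising this ratio is one way to cut down FPs
random.seed(42)
keep = random.sample(negatives, min(len(negatives), len(positives)))
train_tiles = positives + keep
```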
The results were encouraging enough that I went ahead and hacked together a small prototype to visualize the model inference results and use that, along with some simple polygon edit tools, to correct/update the labels against a new base image (Mapbox satellite-v9 for Modesto in this case):
Pardon the UI - it’s messy and overloaded with a few too many things right now. If you do play around with the page, you’ll notice I’ve corrected and realigned almost all of the original labels to match up with the Mapbox imagery (blue outlines, vs the red polygons which are the original labels) so I can generate new training data to finetune the model with. Correcting and adding all the new labels for Modesto took ~2 hrs. Note that this is not yet linked to any backend, so polygon edits only persist for the duration of the page session. I do include a button to download the edited geojson to your local drive and an upload button, so that’s one way to save, load, and continue editing progress for now.
I very much agree that a more versatile model that can handle multiple imagery resolutions, sensors, and geographies would be ideal, and key to making that happen is more diverse training data. The idea with this prototype is to generate predictions to guide the creation of new training labels, add those to the training set, train/finetune some more, do better segmentation, and bootstrap our way to a more robust model. In addition to the US and Europe, I’m interested in adding some data from Asia and Africa in this way as well. There’s a lot of great open imagery on OpenAerialMap to work with.
I’ve finished a quick model fine-tuning run using my new cleaned-up/re-aligned training data over Modesto mapbox imagery and ran inference with it to segment PVs on mapbox imagery over new areas of NW Modesto CA and Queens NY.
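The fine-tuning step itself is small - roughly this, assuming fastai v2 and a saved checkpoint from the earlier run (the checkpoint name and tile layout are hypothetical):

```python
# Sketch: fine-tune the earlier model on the re-aligned Modesto/Mapbox tiles.
from fastai.vision.all import *

path = Path("modesto_mapbox_tiles")       # hypothetical re-labeled tiles
dls = SegmentationDataLoaders.from_label_func(
    path, get_image_files(path / "images"),
    lambda fn: path / "masks" / f"{fn.stem}_mask.png",
    codes=["background", "solar_pv"], bs=8)

learn = unet_learner(dls, resnet34, metrics=[JaccardCoeff()])
learn.load("usgs_ca_stage1")              # hypothetical checkpoint from the earlier training
learn.fine_tune(20)                       # ~20 epochs on ~1.6k Modesto tiles
```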
Despite very minimal training (finetuned 20 epochs on 1.6k image tiles of Modesto only) and not great eval metrics (pixel IoU of ~0.55), it already looks promising as a visual tool to guide data labeling efforts for both commercial and residential rooftop installations.
Not surprisingly, the model doesn’t do as well in Queens, where the buildings have different construction and the rooftops are often flat and made of different materials. With some new labels created for Queens and other diverse locations, the model should increasingly be able to handle very diverse-looking roofs and PV installations.
Explore the results for yourself on the updated prototype page. Click the “Next AOI” blue button to fly to each area and slide the orange slider to adjust opacity of the prediction raster layer:
I pulled the OpenAerialMap (OAM) db and filtered it to find all of the high resolution maps, and then manually selected only the ones with mostly buildings.
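For reference, something along these lines works for pulling and filtering the catalog - I’m assuming the public OAM metadata endpoint, its paging params, and a gsd field in meters/pixel, so adjust if those differ:

```python
# Sketch: page through OAM catalog metadata and keep very high-resolution images.
# Endpoint path, paging params, field names, and the 10cm cutoff are assumptions.
import requests

results, page = [], 1
while True:
    resp = requests.get("https://api.openaerialmap.org/meta",
                        params={"limit": 100, "page": page})
    batch = resp.json().get("results", [])
    if not batch:
        break
    results.extend(batch)
    page += 1

high_res = [r for r in results if r.get("gsd") and r["gsd"] <= 0.10]
print(len(high_res), "high-resolution images")
```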
You can check out all the maps here: https://1v88l.csb.app/
If you click on a thumbnail you get taken to OpenStreetMap iD Editor where people can add PV panels.
If we could get a community effort together to map these, we could then export the results and add them to the training data.
This is great! I wish iD editor had an option to add a custom raster layer with adjustable opacity. Then I would generate model prediction rasters for each OAM image and have people optionally add them to their map displays to help guide their PV mapping process. And as new training data is made and incorporated into the model, new predictions can be generated. Unfortunately, right now it looks like you can only add a custom base layer that replaces the other background imagery layers (i.e. Mapbox, Bing, DG), not add a new semi-opaque layer on top like my prototype does.
I’ve drawn some new labels based on the Queens-area prediction, added them to re-finetune the model (along with the Modesto labels), and predicted on a larger strip of Brooklyn, NYC on the Mapbox imagery. It’s shown as a new AOI on the prototype page. The model still makes some errors on certain asphalt-colored and glass-paned rooftops, but that’s a function of how much training data there is and what it covers. I think my next step is to create some more labels from this Brooklyn prediction, realign the labels to the NYC 2018 open imagery, retrain, and then try running inference on all of NYC on the open imagery.
On the prototype page, I’ve also added the NYC 2018 imagery as a base layer and a feature to see your lat/long coordinates with direct links to the same spot on OSM & Google Maps. The display box shows up when you click on any point on the map. This allows easier cross-comparison and obtaining more info about the building or area in question. It comes in handy when I’m not quite sure if a rooftop installation is a solar PV, a glass roof, or something else.
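The deep links themselves are just URL patterns built from the clicked lat/long - for example (zoom level is arbitrary here):

```python
# Example: deep links to the same spot on OSM and Google Maps from a lat/long.
def map_links(lat: float, lon: float, zoom: int = 19) -> dict:
    return {
        "osm": f"https://www.openstreetmap.org/#map={zoom}/{lat:.6f}/{lon:.6f}",
        "google": f"https://www.google.com/maps/search/?api=1&query={lat:.6f},{lon:.6f}",
    }

print(map_links(40.7355, -73.9175))   # a point in Queens, NY
```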
We’re looking at running our own OSM instance just for solar, so we can consider adding a 2nd layer option as a feature to it.
Really useful to see the NYC 2018 layer for comparison. I notice that there is some offset, so some labels don’t align. Seems to be the never-ending problem with satellite imagery.
For the purposes of making useful training data for the community, we will need to keep the labels paired with the satellite source used to create them (so there is no offset).
Do you think it would be worth making a single NYC training dataset using the NYC 2018 imagery?
Also, do you think it would be worth running your trained model against the Open Aerial Maps to find likely candidates for manual labeling? We could then use https://tasks.hotosm.org/ or https://maproulette.org/ to crowdsource the labels
Here’s the latest inference run over a large part of Inner Queens and Upper East Side Manhattan. It’s made with a model that was trained on ~500 pixel-perfect polygon labels I drew over a few other parts of NYC in the visually-aided, bootstrapping manner mentioned before:
- There are quite a number of False Positives. Some of this is because the model hasn’t trained on enough negative examples, especially of rarer things like train tracks, complex glass roofs, and certain road markings. Some of it is intentional by design: I’m prioritizing a model with higher recall at the expense of lower precision. It’s easier to delete/ignore a bad FP prediction than to have to manually re-search an area for False Negatives that were missed.
- Shown in the blue editable polygons is a “1st draft” of polygons generated directly from the raster/pixel-level predictions (see the sketch after this list). These polys are not perfect in any sense, but there’s some promise to starting mappers off with a draft. I’m exploring the UX threshold between a predicted polygon that’s simple and close enough to correct vs when it would be easier to create the polygon from scratch (maybe because the predicted poly is too far off or has too many vertices to correct quickly).
- Shown in green outlined polys in different AOIs around the city are the training labels I used to create the model that predicted on the Inner Queens/UES area. These should all be nearly pixel-perfect and aligned to the NYC 2018 imagery.
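One way to generate this kind of draft is a polygonize-and-simplify pass over the thresholded prediction raster - a sketch of that step, with the threshold, simplification tolerance, and minimum area as illustrative assumptions:

```python
# Sketch: turn a pixel-level prediction raster into draft vector polygons by
# thresholding, polygonizing, and simplifying. Values are illustrative.
import rasterio
from rasterio.features import shapes
from shapely.geometry import shape
import geopandas as gpd

with rasterio.open("queens_ues_prediction.tif") as src:   # hypothetical prediction raster
    probs = src.read(1)
    transform, crs, res = src.transform, src.crs, src.res

mask = (probs > 0.5).astype("uint8")      # probability threshold (assumed)
polys = [
    shape(geom).simplify(res[0] * 2)      # ~2-pixel simplification tolerance
    for geom, value in shapes(mask, mask=mask == 1, transform=transform)
    if value == 1
]

# drop tiny speckles and save as editable draft geojson
gdf = gpd.GeoDataFrame(geometry=polys, crs=crs.to_wkt())
gdf = gdf[gdf.area > 2.0]                 # min area; assumes a projected CRS in meters
gdf.to_file("queens_ues_draft_polys.geojson", driver="GeoJSON")
```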
I’ve added some keyboard shortcuts and other enhancements to the UI to speed up the review and polygon creation/editing process:
- “c” toggles the raster prediction layer on and off (to the user-set opacity level when on)
- “b” toggles between the NYC 2018 imagery and Mapbox base layers
- “3” switches to polygon creation mode without having to move the mouse cursor
- shift-mousedrag selects all the polygons within the selection box; you can then delete or move all the selected polys at once
- the default polygon creation mode is now an “assisted rectangle” which is faster to draw
@dct - yea, it may be worth having a full NYC 2018 labeled dataset although if I had to prioritize, I would rather first have more labels across a number of diverse data sources than labels extensively created on one imagery source and area.
Inconsistent offsets between sat/aerial imagery sources are always going to be a problem, and yea, agreed - there needs to be a good link between labels and the source imagery they were created on. This is not currently done very often or consistently in OSM data - it should be captured in the source= tag but often isn’t.
Re: running inference on all OAM imagery, I did some quick runs on a few images to see how this current NYC-trained model does. It looked promising at 1st glance… will update soon with more info once I take a closer look at the results. I don’t know that we need to run a model on every OAM image - many of them are from post-disaster surveys, which may be less relevant for training data purposes, and other images are too old, visually distorted, or otherwise not high quality enough to be useful. Maybe we can start off by manually spot-checking and selecting some imagery that represents a diversity of areas, resolutions, and sensors, and that is recently captured, high quality, and relevant (i.e. not post-disaster or captured for some other special purpose).
Here are some NYC-trained model inference results (both raw pixel/raster and polygon/vector-level predictions) on some OAM images I manually selected to be of diverse locations, image capture conditions, distortion effects…
Here’s the latest from NYC: I’ve run the current version of the model on almost all of the Bronx + a bit of New Jersey (~130 sqkm) at zoom level 20 (~15cm) and generated the raster and polygon predictions displayed here:
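For scale, covering that area at zoom 20 means enumerating a lot of slippy-map tiles to fetch and run inference on - e.g. with mercantile (the bounding box here is an illustrative box roughly around the Bronx, not my exact AOI):

```python
# Sketch: enumerate the zoom-20 XYZ tiles covering an AOI for inference.
import mercantile

west, south, east, north = -73.93, 40.79, -73.77, 40.91   # rough Bronx bbox (illustrative)
tiles = list(mercantile.tiles(west, south, east, north, zooms=20))
print(f"{len(tiles)} z20 tiles to fetch and run inference on")

# each Tile(x, y, z) then plugs into whatever XYZ imagery endpoint is in use
print(tiles[0])
```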
The model segmented about 2300 distinct polygons, most of which are correct on a quick visual inspection across the area.
Creates almost pixel-perfect segmentation on all of the larger installs:
Not as pixel-perfect but still useful on residential rooftops (and there are more omission errors here given the smaller sizes and hitting the limits of the 15cm image resolution):
And not surprisingly, gets a bit confused about weird boxy shadows, infrequently seen features, the occasional car:
This page is an “editable sandbox” in the sense that I’ve enabled anyone with the link to save polygon edits to cloud storage (the floppy disk icon on the bottom left), and those changes will be reflected on a page refresh to anyone else with the link.
The brighter green polygon outlines you see in a few areas are visually flagged as manually corrected. You can change the outline color of a selected polygon with the hotkey “d” to green (indicating manual correction) or yellow (work-in-progress) as a way to visually tag and track the progress of your work. Hitting “f” will fly you to and center on a polygon if there’s one in the viewport - another way to speed up the visual search process.
As before, “b” toggles between the base images and “c” toggles the raster prediction layer from 0 visibility to semi-opaque.