@sallamander hard agree on chaining simple tasks. I’ve been trying to stay on top of the leaderboard just to make sure the tasks seem doable, etc. Drawing areas from nothing can feel cumbersome, but it also kind of feels like duplicated work to create a node and then go back and create an area from it in a different task. It’s hard to know how users feel without getting a lot of people to test it. I wrote the instructions in such a way that it’s okay to create nodes but areas are better, though I realize that ambiguity might increase attrition as well.
But yeah, I’m in agreement on having each task be very simple and chained together; this is also conducive to the way MapRoulette tasks can be generated by queries. For example, you can query solar nodes that aren’t areas and say “please make this an area”, then query areas missing a certain tag and say “please add this one tag”. I think the important thing is that each task leaves the PV object in a stable state where progress can be queried directly; otherwise it might be difficult to query for things like “loose bounding area” if we aren’t allowed to add tags representing imprecision.
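To make the query-driven idea concrete, here’s a rough sketch of the kind of Overpass queries I mean. Everything here is illustrative: it assumes the standard `power=generator` + `generator:source=solar` tagging, `generator:output:electricity` is just a stand-in for “a certain tag”, and the bounding box is arbitrary.

```python
# Sketch only: solar-PV task queries against the public Overpass API.
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Solar generators still mapped as bare nodes -> "please make this an area" tasks.
NODES_NOT_AREAS = """
[out:json][timeout:60];
node["power"="generator"]["generator:source"="solar"]({bbox});
out body;
"""

# Solar generators already drawn as areas but missing a tag -> "please add this tag" tasks.
AREAS_MISSING_TAG = """
[out:json][timeout:60];
way["power"="generator"]["generator:source"="solar"]
   [!"generator:output:electricity"]({bbox});
out body;
"""

def run_query(template: str, bbox: str = "51.3,-0.6,51.7,0.3") -> list:
    """Run an Overpass query over a south,west,north,east bounding box."""
    resp = requests.post(OVERPASS_URL, data={"data": template.format(bbox=bbox)})
    resp.raise_for_status()
    return resp.json().get("elements", [])
```

MapRoulette can build challenges from Overpass queries like these, which is a big part of why keeping each task’s end state queryable matters.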
Another problem I foresee, though, is that the OSM community is very particular about the quality of what’s allowed on the map.
My understanding:
- Node representing a solar panel - okay
- Area representing a solar panel - okay
- Not a lot of descriptive tags on a solar panel - okay
- Loose bounding box - probably not okay; they don’t have a way to mark things as imprecise/unfinished/low-confidence, and in my experience are philosophically opposed to that
- Directly imported from an ML algorithm - probably not okay either
- Sourced from proprietary data - definitely not okay
This means that for any data that is imported, found by ML, of questionable accuracy, or from a proprietary source, we’ll always have to maintain our own database of PV locations (whether that’s one mega DB that all OCF projects contribute to, or one per project used to find locations), and then use human verification to port what we’re allowed to from there into OSM. (And once it’s in OSM, the imprecise data in the staging database would get replaced with the human-verified version.)
I already have this architecture for my project SPDW; it’s just a little more primitive than what I’m suggesting here. But I foresee the pattern data source -(machine)> staging database -(human)> OSM recurring a lot throughout the PV mapping projects. It might benefit the organization to create a single database that all sources get staged in, just to standardize the human verification process across different sources: a place where algorithm source and imprecision are allowed as tags, and where proprietary data sources are allowed. The end goal is still to get all the data we can into OSM; a rough sketch of what a staging record might carry is below.
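Purely to illustrate what I mean (the field names here are made up, not the actual SPDW schema): the staging record is the place for exactly the metadata OSM won’t accept, and only the human-verified, non-proprietary subset ever gets exported.

```python
# Hypothetical staging record -- illustrative field names, not SPDW's actual schema.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Source(Enum):
    ML_DETECTION = "ml_detection"   # found by an algorithm, needs a human check
    PROPRIETARY = "proprietary"     # can never be exported to OSM
    MANUAL_SURVEY = "manual_survey"


@dataclass
class StagedPV:
    geometry_wkt: str                   # point or loose polygon; imprecision is fine here
    source: Source
    confidence: Optional[float] = None  # e.g. ML detection score; OSM has no tag for this
    verified_by_human: bool = False
    osm_id: Optional[int] = None        # filled in once the verified version lands in OSM
    osm_tags: dict = field(default_factory=dict)  # only tags we're allowed to push to OSM

    def exportable_to_osm(self) -> bool:
        """Only human-verified, non-proprietary records ever leave the staging DB."""
        return self.verified_by_human and self.source is not Source.PROPRIETARY
```

Keeping `osm_id` on the staging row is what makes the last step possible: once the verified object exists in OSM, the imprecise original can be retired or overwritten without losing track of where it went.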
Anyway, sorry, big tangent. I haven’t had a chance to take a deep look at Zooniverse. I think its added benefits might be extra visibility, and possibly letting us structure our tasks so they aren’t directly in the context of OSM.