Here’s the background for this project:
Area of Habitat maps (AOHs) are the backbone of STAR and LIFE (see A North Star for my PhD: LIFE and STAR (24 Oct 2025)).
The LIFE and STAR pipelines produce AOHs for ~30k species.
30k species is too many to validate for correctness by humans.
So Dahal et al. proposed a two-step validation mechanism to flag outlier AOHs:
Model checker:
Does the proportion of a species’ range retained in its AOH match that of other species in the same taxon, with the same habitat preference and elevation range?
If it differs significantly from other similar species, the AOH should be checked.
Occurrence checker:
What percentage of GBIF occurrence records for a given species occur within its AOH?
If too many occurrences lie outside of its AOH, it suggests the AOH should be checked.
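To make the two checks concrete, here is a minimal sketch in plain Python. It is not the statistical model from Dahal et al.: the model checker is approximated by comparing each species' retained-range proportion against the median of its peer group (same taxon, habitat preference and elevation band) using a robust z-score, and the occurrence checker uses a fixed cut-off. All function names, the `threshold` and `min_inside` values, and the `in_aoh` callback are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def flag_model_outliers(records, threshold=3.5):
    """Flag species whose retained-range proportion is unusual for their group.

    records: iterable of (species_name, group_key, prevalence), where
    prevalence = AOH area / range area and group_key is any hashable
    description of taxon, habitat preference and elevation band.
    The median/MAD rule and the 3.5 threshold are illustrative assumptions.
    """
    groups = defaultdict(list)
    for name, group, prevalence in records:
        groups[group].append((name, prevalence))

    flagged = []
    for members in groups.values():
        values = [p for _, p in members]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread in this group, nothing to compare against
        for name, p in members:
            # 0.6745 rescales the MAD so the score is comparable to a z-score
            if 0.6745 * abs(p - med) / mad > threshold:
                flagged.append(name)
    return flagged


def point_prevalence(occurrences, in_aoh):
    """Fraction of (lon, lat) occurrence points that fall inside the AOH.

    in_aoh is a hypothetical callback returning True when a point lies in
    the AOH; in practice it would sample the AOH raster.
    """
    if not occurrences:
        return None  # no records, so the check cannot be run
    inside = sum(1 for lon, lat in occurrences if in_aoh(lon, lat))
    return inside / len(occurrences)


def flag_occurrence_outlier(occurrences, in_aoh, min_inside=0.5):
    """Flag the AOH when fewer than min_inside of the points lie inside it."""
    pp = point_prevalence(occurrences, in_aoh)
    return pp is not None and pp < min_inside
```

A species with no occurrence records is left unflagged here rather than treated as a failure, since the absence of data says nothing about the AOH itself.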
This is a sensible starting point for validation. But can we do better? Michael and Aly have several ideas. I also think LLMs and agentic AI techniques could be very useful here.
Whatever validation procedure we come up with can also serve as a test-bed for the habitat maps we create when we turn to Research Idea: Towards better habitat maps using geospatial foundation models.
Proposed methodology: