Skip to content

Breaking: more consistent (dis)aggregation of lookups #914

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

tiemvanderdeure
Copy link
Collaborator

closes #913

Copy link

codecov bot commented Mar 7, 2025

Codecov Report

Attention: Patch coverage is 79.62963% with 11 lines in your changes missing coverage. Please review.

Project coverage is 83.16%. Comparing base (a15ebb1) to head (9b750c6).
Report is 178 commits behind head on main.

Files with missing lines Patch % Lines
src/methods/aggregate.jl 79.62% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #914      +/-   ##
==========================================
+ Coverage   82.32%   83.16%   +0.83%     
==========================================
  Files          60       58       -2     
  Lines        4357     5399    +1042     
==========================================
+ Hits         3587     4490     +903     
- Misses        770      909     +139     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tiemvanderdeure
Copy link
Collaborator Author

Okay I gave it a go now with a more proper fix and added some tests to show what my goal is here.

Some of the tests fail and I'm not totally sure how to fix them yet, will look more at it later.

But it also seems the tests 1) still use some old (deprecated?) syntax (e.g. passing a locus to disaggregate), and 2) mostly test point Rasters, where I think most of the actual usage of (dis)aggregate will be on interval sampling. It's also much more intuitive what disaggregation of intervals should look like vs disaggregation of points.

@rafaqz
Copy link
Owner

rafaqz commented Apr 1, 2025

Sorry havent had time to look at this, did you figure it out?

@tiemvanderdeure
Copy link
Collaborator Author

I figured something out, but if IIRC the thing I figured out would be breaking

@rafaqz
Copy link
Owner

rafaqz commented Apr 4, 2025

We can bundle it with the DD breaking change

@tiemvanderdeure
Copy link
Collaborator Author

tiemvanderdeure commented Apr 4, 2025

Now I remember - we need to choose if we still want to allow Locus to be passed in as the first argument. This still works (and is tested!), but is not documented.

EDIT: it worked, but this is exactly the thing I broke here :)

@tiemvanderdeure
Copy link
Collaborator Author

More specifically my idea here is that the clearest case for disaggregating is with interval sampling. We can visualize breaking each grid cell into smaller grid cells, and that's what I implemented here. Before this PR that would depend on the sampling.

It's less clear how disaggregation should work for points, which is where this locus argument came in - but again, at the moment that is not documented

Co-authored-by: Felix Cremer <felix.cremer@dlr.de>
@tiemvanderdeure
Copy link
Collaborator Author

At the moment the PR will just always disaggregate point from the center. If you think that is a reasonable way to go all we need to do is fix the tests and this PR is good to go. I'm away for the weekend but can do it on Monday

newl = l[start:scale:stop]
sp = aggregate(method, span(l), scale)
return rebuild(newl; span=sp)
if issampled(l) && isordered(l) && isregular(l)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't aggregating Irregular ok in some cases?

Like its a bit wrong to do with mean but fine with sum?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in lookup aggregation. How should we aggregate irregular lookups?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like they were in the past... Start and end locus are especially simple. It's just dropping the other values.

Center is a bit more complicated, I guess the middle of the aggregated range?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an extra check for regular - so for irregular lookups it works basically like it used to.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually we might want the old behaviour for point lookups as well - I'm not sure

@tiemvanderdeure
Copy link
Collaborator Author

I think this PR is good to go now.

All the arithmetics are super annoying but just to show what this PR did, let me plot some rasters before/after (dis)aggregation.
Before this PR:
billede
After this PR:
billede

Code here is just:

using Rasters, GLMakie, Rasters.Lookups
ras = rand(X(Sampled(1:10; sampling=Intervals(Center()))), Y(Sampled(1:10; sampling=Intervals(Center())))) |> Raster
function plot_outline!(ax, ras; color, kw...)
    bnds = Rasters.intervalbounds(ras)
    bnds_A = tuple.(bnds[1], permutedims(bnds[2]))
    broadcast(bnds_A, color) do bnds, color
        (x, y) = bnds
        poly!(ax, [(x[1], y[1]), (x[2], y[1]), (x[2], y[2]), (x[1], y[2])]; color, kw...)
    end
end

rasag = aggregate(mean, ras, 2)
disag = disaggregate(rasag, 2)

fig = Figure(size = (1000, 500))
ax = Axis(fig[1,1])
plot_outline!(ax, ras, color=parent(ras), strokewidth=2, colorrange = extrema(ras), alpha = 0.5)
plot_outline!(ax,rasag, color=parent(rasag), strokewidth=5, colorrange = extrema(ras), alpha = 0.5)
ax = Axis(fig[1,2])
plot_outline!(ax, disag, color=parent(disag), strokewidth=2, colorrange = extrema(ras), alpha = 0.5)
plot_outline!(ax, rasag, color=parent(rasag), strokewidth=5, colorrange = extrema(ras), alpha = 0.5)
fig

@rafaqz
Copy link
Owner

rafaqz commented Apr 9, 2025

Would you say breaking change or bugffix?

@tiemvanderdeure
Copy link
Collaborator Author

I'm not even sure. Some of the changes for point sampling might be actually breaking? But some of that behaviour was already deprecated

@tiemvanderdeure
Copy link
Collaborator Author

But if there's a breaking release of DD on the way we might as well just add it to the breaking release here?

@rafaqz
Copy link
Owner

rafaqz commented Apr 9, 2025

Yeah, it makes sense I'm just not sure when it will all happen

@rafaqz rafaqz changed the title Disaggregate shiftlocus Breaking: Disaggregate shiftlocus Apr 14, 2025
@tiemvanderdeure tiemvanderdeure changed the title Breaking: Disaggregate shiftlocus Breaking: more consistent (dis)aggregation of lookups Apr 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

disaggregate does not preserve extent
3 participants