Discussion before the meeting
Plots with floating normalization
Louis: Trying to understand the large chi2, I switched to scaling the prediction to the integral of the data. This is a global factor (may differ for each chi2 computation) in addition to the y-dependent scaling I had been using. The justification is that the total cross section is dominated by the medium-pT part. We know that there are other problems there, and blame MadGraph. The factor is 1.097 when fitting all y bins together. There is a slight y dependence but no trend (2% at most).
This is the new "global" fit. We get a chi2/ndf of 2 instead of 3, and the p-value is 2.5%.
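As a minimal sketch of the floating normalization described above: one global factor rescales the prediction so that its integral matches that of the data. The function name, the toy numbers, and the use of bin widths in the integral are assumptions for illustration, not the code actually used for the fit.

```python
def scale_to_data_integral(prediction, data, bin_widths):
    """Return (factor, scaled_prediction) such that the integral of the
    scaled prediction equals the integral of the data.

    Inputs are assumed to be differential cross sections per bin, so the
    integral is sum(value * bin_width). This is an assumption.
    """
    data_integral = sum(d * w for d, w in zip(data, bin_widths))
    pred_integral = sum(p * w for p, w in zip(prediction, bin_widths))
    factor = data_integral / pred_integral
    return factor, [factor * p for p in prediction]


# Toy numbers, purely illustrative (not the measured values).
toy_data = [10.0, 20.0, 5.0]
toy_prediction = [9.0, 18.0, 4.5]
toy_widths = [1.0, 1.0, 2.0]

factor, scaled = scale_to_data_integral(toy_prediction, toy_data, toy_widths)
print(f"global normalization factor: {factor:.3f}")
```

Because the factor is recomputed from whichever bins enter a given chi2, it can differ between the global fit and the per-y fits, as noted above.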

I also made the y-dependent plots, split and grouped:


So what's happening? We're being hit by the 2nd bin in the 3.5-4.0 region. It's significantly higher than the rest.
Less correlation
LHCb quotes the following systematic uncertainties. When combining their measurements across data-taking years, they treat each one as either correlated (C) or uncorrelated (U) between years.
- Luminosity (C)
- Efficiency (U) (provided with a correlation matrix)
- Background (C)
- FSR correction (C)
- Efficiency closure test (C)
- Alignment (U)
- Unfolding (U)
So far I've taken everything as fully correlated between bins, except for the efficiency, for which I used the provided matrix.
I am of the opinion that the background, FSR, and unfolding uncertainties may not be 100% correlated between bins. Decorrelating them completely helps a bit, bringing the p-value to 6%.
It would also not be outrageous to use the efficiency correlation matrix for the closure and alignment uncertainties. The reasoning is that they probably depend on where the muons go in the detector. Doing this has no effect.
None of these changes shifts the minimum of the chi2 curves significantly. It stays at 1.0 GeV for most y ranges.
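The three bin-to-bin treatments discussed above (fully correlated, fully decorrelated, or following a provided correlation matrix) differ only in the correlation matrix that multiplies the outer product of the per-bin uncertainties. A minimal sketch, assuming NumPy and hypothetical helper names and toy numbers (this is not the actual analysis code):

```python
import numpy as np


def source_covariance(sigma, corr=None, mode="correlated"):
    """Covariance between bins for one systematic source.

    sigma : absolute per-bin uncertainties for this source
    mode  : "correlated"   -> 100% correlated between bins
            "uncorrelated" -> fully decorrelated (diagonal)
            "matrix"       -> use the supplied correlation matrix `corr`
    """
    sigma = np.asarray(sigma, dtype=float)
    n = sigma.size
    if mode == "correlated":
        rho = np.ones((n, n))
    elif mode == "uncorrelated":
        rho = np.eye(n)
    elif mode == "matrix":
        rho = np.asarray(corr, dtype=float)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return rho * np.outer(sigma, sigma)


def chi2(data, prediction, cov):
    """chi2 = r^T C^{-1} r with r = data - prediction."""
    r = np.asarray(data, dtype=float) - np.asarray(prediction, dtype=float)
    return float(r @ np.linalg.solve(cov, r))


# Toy inputs, purely illustrative (not LHCb numbers).
stat = np.array([0.5, 0.4, 0.6])        # statistical uncertainties
background = np.array([0.2, 0.3, 0.1])  # one systematic source
data = np.array([10.0, 12.0, 9.0])
prediction = np.array([10.4, 11.5, 9.3])

cov_corr = np.diag(stat**2) + source_covariance(background, mode="correlated")
cov_decorr = np.diag(stat**2) + source_covariance(background, mode="uncorrelated")

print("chi2, background fully correlated:", chi2(data, prediction, cov_corr))
print("chi2, background decorrelated:    ", chi2(data, prediction, cov_decorr))
```

The total covariance is the sum over sources, so switching one source between treatments only changes its own term. Whether the chi2 goes up or down depends on the pattern of the residuals.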