Goodness-of-fit tests assess and compare the quality of fits
Formula:
$$ \chi^2=\sum_{i=1}^n\frac{(y_i-f(x_i))^2}{\sigma_i^2} $$
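The sum above can be evaluated directly. The following is a minimal sketch: the function name `chi_square`, the straight-line model and the data values are all made up for illustration.

```python
def chi_square(xs, ys, sigmas, model):
    """Sum of squared residuals, each divided by its variance sigma_i^2."""
    return sum(((y - model(x)) / s) ** 2 for x, y, s in zip(xs, ys, sigmas))


# Hypothetical measurements compared to the assumed model f(x) = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.5, 3.5, 6.0]
sigmas = [0.5, 0.5, 1.0]


def line(x):
    return 2.0 * x


chi2 = chi_square(xs, ys, sigmas, line)
print(chi2)  # residuals in units of sigma are +1, -1, 0
```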
The test is based on the $\chi^2$ probability for $N$ degrees of freedom
$$ {\rm Prob}(\chi^2;N)=\int_{\chi^2}^{\infty}P(\chi'^2;N)d\chi'^2 $$
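where $P(\chi'^2;N)$ is the $\chi^2$ probability density, $n$ is the number of measurements and $N$ is the number of degrees of freedom ($n$ minus the number of fitted parameters), so in general $n \neq N$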
Setting a threshold on $\chi^2$ or $\chi^2/N$ requires taking into account the corresponding probability. As a consequence, a single $\chi^2/N$ threshold for all $N$ does not make sense
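To see why one $\chi^2/N$ cut cannot serve all $N$, the probability integral can be evaluated for a fixed $\chi^2/N$ and several $N$. The sketch below is restricted to even $N$, where the integral has the closed form $e^{-\chi^2/2}\sum_{k=0}^{N/2-1}(\chi^2/2)^k/k!$. The function name `chi2_prob_even` and the chosen value $\chi^2/N=1.5$ are illustrative assumptions.

```python
import math


def chi2_prob_even(chi2, ndf):
    """Prob(chi2; ndf) for EVEN ndf only (closed form, assumption of this sketch)."""
    if ndf <= 0 or ndf % 2 != 0:
        raise ValueError("this sketch only handles positive even ndf")
    half = chi2 / 2.0
    term = 1.0   # (chi2/2)^k / k! for k = 0
    total = 0.0
    for k in range(ndf // 2):
        total += term
        term *= half / (k + 1)
    return math.exp(-half) * total


# Same chi2/N = 1.5, different numbers of degrees of freedom
for ndf in (2, 10, 50):
    print(ndf, chi2_prob_even(1.5 * ndf, ndf))
```

The printed probability drops as $N$ grows, so a value of $\chi^2/N$ that is unremarkable for a few degrees of freedom can be a poor fit for many.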
🚴♀️ Example: When applying a $\chi^2$ test to a binned data set, the number of measurements is the number of bins and the error is the error on the count rate within a bin, which in most cases is the Poisson error, i.e. the square root of the count rate
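The binned case can be sketched as follows, taking the Poisson error as the square root of the observed count in each bin. The function name `binned_chi_square`, the example counts and the choice to skip empty bins (where this error estimate is zero) are assumptions of this sketch, not part of the notes.

```python
import math


def binned_chi_square(observed, expected):
    """Chi-square over bins with sigma_i = sqrt(observed count in bin i)."""
    chi2 = 0.0
    n_used = 0
    for n_obs, n_exp in zip(observed, expected):
        if n_obs == 0:
            continue  # assumption: skip empty bins, their sqrt(n) error is zero
        sigma = math.sqrt(n_obs)
        chi2 += ((n_obs - n_exp) / sigma) ** 2
        n_used += 1
    return chi2, n_used


# Hypothetical histogram contents and model prediction per bin
observed = [4, 9, 16, 0]
expected = [6.0, 9.0, 12.0, 0.5]
chi2, n_bins = binned_chi_square(observed, expected)
print(chi2, n_bins)
```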
Comparing 2 samples with known $\sigma$
$$ x_1-x_2=0? $$
The variance of the difference is
$$ V_{12}=\sigma_1^2+\sigma_2^2 $$
Compare the difference $x_1-x_2$ to the combined uncertainty $\sigma_{12}=\sqrt{V_{12}}$
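As a sketch of this comparison, the difference can be expressed in units of $\sigma_{12}$ and, assuming Gaussian uncertainties, converted into a two-sided probability with the complementary error function. The function name `compare_measurements` and the numbers are illustrative.

```python
import math


def compare_measurements(x1, sigma1, x2, sigma2):
    """Return (difference in units of sigma_12, two-sided Gaussian p-value)."""
    sigma12 = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    z = (x1 - x2) / sigma12
    # Assumption: Gaussian errors, so P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical measurements: 10.0 +- 0.3 and 11.0 +- 0.4
z, p = compare_measurements(10.0, 0.3, 11.0, 0.4)
print(z, p)  # the difference is about 2 sigma_12
```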
The Kolmogorov test is based on the normalised cumulative distributions of the data and of the expected distribution $P$, and evaluates their greatest difference
$$ D={\rm max}|{\rm cum}(x)-{\rm cum}(P)| $$
This needs to be normalised for the sample size $N$ (here the number of data points)
$$ d = D \sqrt{N} $$
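A minimal sketch of the two quantities above for an unbinned sample: the empirical cumulative distribution steps by $1/N$ at each data point, so the largest deviation is checked just before and just after every step. The function name `kolmogorov_distance`, the sample and the uniform reference distribution are assumptions for illustration.

```python
import math


def kolmogorov_distance(sample, cdf):
    """Return (D, d) with D = max |cum(x) - cum(P)| and d = D * sqrt(N)."""
    xs = sorted(sample)
    n = len(xs)
    big_d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical cumulative jumps from (i-1)/n to i/n at x
        big_d = max(big_d, i / n - f, f - (i - 1) / n)
    return big_d, big_d * math.sqrt(n)


def uniform_cdf(x):
    """Cumulative distribution of a uniform density on [0, 1]."""
    return min(max(x, 0.0), 1.0)


# Hypothetical sample tested against the uniform distribution
D, d = kolmogorov_distance([0.7, 0.1, 0.4], uniform_cdf)
print(D, d)
```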