One of the benefits of constraints is that each new constraint you add can further narrow the domains of existing variables through constraint propagation. Before X+Y #= 6 is posted, X has 5 possible values and Y has 2. Labeling with ff at that point would pick Y first, since it has the smallest domain. However, once you add the constraint X+Y #= 6, clpfd propagates it and reduces the domain of X to just 2 possible values. If you run your query without labeling, you will see:
?- X in 1..5, Y in 1..2, X + Y #= 6.
Y in 1..2,
X+Y#=6,
X in 4..5
So now both variables have the same domain size of two possible values. X is 4 or 5, Y is 1 or 2.
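If you want to check the domain sizes directly rather than read them off the residual goals, a small sketch along these lines should work, assuming SWI-Prolog's library(clpfd). The predicate name domain_sizes/2 is made up for this illustration; fd_size/2 is the library predicate that reports how many values are left in a variable's domain.

```prolog
:- use_module(library(clpfd)).

% domain_sizes(-Before, -After): report the domain sizes of X and Y
% before and after the constraint X + Y #= 6 is posted.
% (domain_sizes/2 is a hypothetical helper, not part of clpfd.)
domain_sizes(before(SX0, SY0), after(SX, SY)) :-
    X in 1..5,
    Y in 1..2,
    fd_size(X, SX0),      % size of X's domain with only X in 1..5
    fd_size(Y, SY0),      % size of Y's domain with only Y in 1..2
    X + Y #= 6,           % propagation prunes X to 4..5
    fd_size(X, SX),
    fd_size(Y, SY).
```

Calling ?- domain_sizes(Before, After). should give Before = before(5, 2) and After = after(2, 2), which matches the domains shown above.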
Now if you look at the help for labeling options:
First fail. Label the leftmost variable with smallest domain next, in order to detect infeasibility early. This is often a good strategy.
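That explains why it tries the values of X first rather than Y: both domains now have the same size, so ff falls back on position and takes the leftmost of the two variables in the list passed to labeling, which here is X.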
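To see the tie-break in action, here is a rough sketch that labels the same constraints twice, once with X listed first and once with Y listed first. It assumes SWI-Prolog's library(clpfd) and the default ascending value order; label_order/2 and the atoms x_first and y_first are names invented for this example.

```prolog
:- use_module(library(clpfd)).

% label_order(+Which, -Sols): post the constraints, then label with ff.
% Sols is the list of X-Y pairs in the order labeling/2 finds them.
% (label_order/2 is a hypothetical helper, not part of clpfd.)
label_order(x_first, Sols) :-
    X in 1..5, Y in 1..2, X + Y #= 6,
    % Both domains have size 2, so ff picks the leftmost: X.
    findall(X-Y, labeling([ff], [X, Y]), Sols).
label_order(y_first, Sols) :-
    X in 1..5, Y in 1..2, X + Y #= 6,
    % Same constraints, but Y is now leftmost, so ff picks Y.
    findall(X-Y, labeling([ff], [Y, X]), Sols).
```

With X first, the solutions should come out as [4-2, 5-1] (X is tried at 4, then 5). With Y first, they should come out as [5-1, 4-2] (Y is tried at 1, then 2). The set of solutions is the same either way; only the order in which they are found changes.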