Connect the drain of MS1 to the emitter of Q1 (the minus input of the op-amp) so that the start-up circuit can actually work.
This is best analyzed from a control-systems (feedback circuit) perspective. Your start-up circuit monitors the BG start-up condition but does not control the BG's start-up operation. The start-up circuit's input and output are the same node, so it will likely oscillate if the BG has not started by itself and the loop has enough gain. If the BG has started, the start-up circuit observes this (M5X supplies a steady current and maintains the voltage on R2X) and shuts itself off. Otherwise, the BG may or may not start by itself, with no help from the start-up circuit.

SPICE may show the BG starting, so the simulation result is not trustworthy. The paper you mention could also easily contain a drawing error, and even if it does not, researchers pick only the circuits that started, measure them, and publish those results. When the start-up circuit is not functioning, leakage, supply dv/dt, and similar effects decide whether the BG starts.

With this start-up circuit, the BG is not a reliable reference and is not commercially viable. Transistor sizes do not matter for a circuit that is topologically wrong, unreliable, and unmanufacturable.
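To make the feedback view concrete, here is a minimal toy sketch in Python. It is not a model of your schematic: the cubic loop function, the normalized current x (1.0 is the intended operating point), the threshold X_UNSTABLE, the gain, and the kick value are all assumptions chosen only to give a loop with a stable "off" state and a stable "on" state. It compares a start-up block that only observes the loop with one that injects current into it.

```python
# Toy bistable model of a self-biased loop (all numbers are hypothetical).
# x is the core current normalized to its intended value:
#   x = 0          -> stable "dead" state
#   x = X_UNSTABLE -> unstable threshold between the two states
#   x = 1          -> stable, intended operating point
X_UNSTABLE = 0.1
LOOP_GAIN = 50.0


def simulate(startup_injects, x0=0.0, kick=0.5, steps=20000, dt=1e-3):
    """Forward-Euler integration of the toy loop; returns the final x."""
    x = x0
    for _ in range(steps):
        # Loop dynamics: zero at x = 0, X_UNSTABLE and 1.
        loop = LOOP_GAIN * x * (x - X_UNSTABLE) * (1.0 - x)
        # A start-up block that controls the loop injects current while the
        # sensed level is low, then shuts itself off. A monitor-only block
        # contributes nothing to the loop.
        inject = kick if (startup_injects and x < 2 * X_UNSTABLE) else 0.0
        x += dt * (loop + inject)
    return x


if __name__ == "__main__":
    print("monitor only, from zero:        ", simulate(False))
    print("monitor only, small disturbance:", simulate(False, x0=0.05))
    print("monitor only, large disturbance:", simulate(False, x0=0.15))
    print("start-up injects into loop:     ", simulate(True))
```

In this toy model the monitor-only case starts only if some disturbance (standing in for leakage or supply dv/dt) happens to push the loop past the threshold, while the injecting start-up reaches the intended point from zero and then turns itself off.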