Thank you for the advice. I will introduce a parameter to do this.
But what about the idea itself? I don't know whether people will like this... It
requires a little more work when writing initcalls.
Maybe we could limit the multithreaded part to device_initcall? That seems to be
where most of the time is spent, and the others take less than 1s in total (at
least on my machine).