The 5 watt zener diode is all you need across the base / collector junction of the transistor because the transistor, rather than the diode, provides the main current path. I normally work with the MJW3281A transistor. As I recall it's an NPN with a breakdown voltage of 1000V and a power dissipation of 200 watts.
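To put a rough number on why a small zener is enough: with the zener connected collector to base, the zener only carries the transistor's base current, which is the total shunt current divided by roughly the current gain plus one. A minimal sketch, where the gain of 50, the 100 V zener and the 200 mA shunt current are illustrative assumptions and not datasheet figures:

```python
def zener_share(shunt_current_a, zener_v, hfe):
    """Estimate the zener's current and dissipation in a transistor-buffered shunt.

    Assumes the zener sits collector-to-base with nothing else feeding
    the base, so the zener current equals the base current.
    """
    zener_current_a = shunt_current_a / (hfe + 1)
    zener_power_w = zener_v * zener_current_a
    return zener_current_a, zener_power_w


# Illustrative values only: 200 mA shunt current, 100 V zener, assumed gain of 50.
i_z, p_z = zener_share(0.200, 100.0, 50)
print(f"zener current: {i_z * 1000:.1f} mA, zener dissipation: {p_z:.2f} W")
```

Under those assumptions the zener handles a few milliamps and well under a watt, while the transistor takes nearly all of the current. Real gain varies with current and temperature, so treat the result as an order-of-magnitude check.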
While this is a beefy transistor, you still have to watch power dissipation when working with very big tubes that could draw upwards of 200 mA of screen current. In this case you stack the transistors in series by connecting the collector of one to the emitter of the next. For currents in this range, I only use 100 volt zeners to limit the voltage drop, and therefore the dissipation, each transistor has to handle.
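The reason for the smaller zeners is plain arithmetic: each transistor dissipates its own collector-emitter voltage times the current through the string. A rough sketch, assuming the worst-case shunt current equals the 200 mA screen figure and a base-emitter drop of 0.7 V (both assumptions for illustration):

```python
def per_transistor_watts(shunt_current_a, zener_v, vbe=0.7):
    """Dissipation in one stage: its collector-emitter voltage times the current."""
    return (zener_v + vbe) * shunt_current_a


current_a = 0.200  # assumed worst-case shunt current, taken from the 200 mA figure
single = per_transistor_watts(current_a, 200.0)   # one stage holding 200 V
stacked = per_transistor_watts(current_a, 100.0)  # each of two 100 V stages
print(f"one 200 V stage: {single:.0f} W; each 100 V stage: {stacked:.0f} W")
```

Halving the zener voltage halves the heat each device and each insulator has to pass to the heatsink, at the cost of more parts in the string.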
Smaller tubes like the 4CX250B can have all the screen voltage handled by a single zener and transistor, whereas the larger ones will require a series string. Since each transistor has its own base voltage established by its zener, the voltage across any one transistor is always about the same as its zener voltage (plus one base-emitter drop). Therefore, you can stack as many as you want to add voltage, and the only parts that will see the increased voltage are the insulators behind the transistors, which isolate them from ground.
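To see where the high voltage ends up in a string, here is a small sketch that adds up the zener voltages from the grounded end. It assumes the bottom emitter is at ground, that each transistor's mounting tab is tied to its collector, and it ignores the base-emitter drops; the eight-stage string is a hypothetical example.

```python
def collector_voltages(zener_voltages):
    """Return each collector's voltage above ground, bottom stage first."""
    voltages = []
    running = 0.0
    for vz in zener_voltages:
        running += vz          # each stage adds roughly its zener voltage
        voltages.append(running)
    return voltages


stack = [100.0] * 8  # hypothetical eight-stage string of 100 V zeners
for stage, (vz, v) in enumerate(zip(stack, collector_voltages(stack)), start=1):
    print(f"stage {stage}: about {vz:.0f} V across it, tab at about {v:.0f} V to ground")
```

Every transistor sees only its own zener voltage, but the insulator under the top device has to stand off the full string voltage.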
The transistor is in a TO-3P case and mounts with a single screw. The case has an internal insulator for the screw itself, but the metal back still must be insulated from the heatsink. In any application where there is more than a couple of hundred volts, I highly recommend using thick ceramic insulators here, not plastic or mica. It's also a good idea to RF bypass the base / emitter junction of each transistor by installing a 0.01 µF cap across each one.
Since this is a shunt regulator, it can be used in the cathode bias circuit on your typical triode amp too, replacing the zener or string of rectifier diodes we often see. Using a transistor to buffer the current load on a smaller zener also improves the degree of regulation and reduces thermal drift in the zener. The MJW3281A transistor is available online from Allied Electronics at about $2 each. I'd only use the Radio Shack transistors for low voltage cathode biasing on smaller amps.