Well, the story is there (introducing a setting, building up tension, and delivering a surprising ending). What you need is to support the text with pictures.
For this you could use in-game footage and a narrator (e.g. from the clan leader's perspective) to interpret the as-yet-unknown game mechanics for the audience. If it's cut to keep up with the pace of the text, it may fit in less than 4 minutes.
Example: "I decide to seek out the rest of the clans in the area so I can figure out the best path to victory for this world" -> the camera starts from the hometown and pans over the already discovered landscape, including the other clans, while the voice-over explains what kinds of victories are possible. The camera stops the moment the Mutants are discovered, then switches to the power graph ...