Soft dependencies without pain
A lot of mods that have soft-dependencies have all kinds of nasty cross-mod issues that cause a CTD at game start, or simply break other mods that are involved in the mess.
Often, the problems only manifest when three or more mods are involved.
The way to avoid these problems completely is quite simple. The problem is created by a handful of bad patterns, and there is nothing mysterious or magical about it.
There is a solid reason for why it happens, and why it can sometimes be quite complex to reproduce, or for a player to fix when setting up a new LO.
In most cases, the cause is a mod with an auto-starting quest script that calls into some foreign mod without checking whether the ESP is present. It simply calls Game.GetFormFromFile and, if it gets None, assumes the mod is missing. USLEEP does this FFS! It is the root cause of all that stupid spam it spews on every single game start.
This pattern is toxic. Do not ever do this:
    Form foo = Game.GetFormFromFile(0x00123456, "Foreign Mod.esp")
    If foo != None
        TypeFromSomeOtherMod realFoo = foo As TypeFromSomeOtherMod
        ; ... do something with realFoo
    EndIf
So what should you do?
Follow the five rules to avoid ever referencing types that don't necessarily exist, and your mod should be much more stable with respect to soft-dependencies, even if you rely on a mod that is a confluence of trouble, such as SLIF.
The Rules
1) Put all the interactions with a single foreign mod into global functions in a single script file of their own. Do not (ever) mix interactions with different mods in the same file!
2) Do not interact with a type from a foreign mod in any function other than a global function.
3) Do not access a global function that performs an interaction with a foreign mod without first confirming that its ESP is present.
4) Do not access foreign mods during first start-up of the game.
5) Avoid access to foreign mods during OnLoad and equivalent events.
You can see this pattern followed properly in Deviously Cursed Loot.
SLD doesn't follow these patterns properly; there are several errors that need to be fixed.
It seems to get away with it anyway, but it's better to be safe than sorry with Skyrim.
But why?
Quite simply, if you reference a type from a script file that doesn't exist, Skyrim doesn't know how to handle it. The type is garbage. It will either CTD, or - if you are lucky - the type will be incompatible with all objects and never take a value apart from None.
To be sure we don't reference types that don't exist, we must make sure those types are not present anywhere in a script we might execute when that type's defining script doesn't exist.
And if that defining script is bound to an ESP, the type will also need that ESP, so we should check that the ESP exists first.
The type only properly exists when its script files are present, and the best rule we have for determining the presence of the script files is the presence of the ESP.
This is important for other reasons too, because we need the ESP to populate pre-defined properties on foreign types.
Clearly both ESP and script need to exist, but checking both creates an absurdly tight binding, and is busywork that requires special Papyrus extensions anyway.
It's good enough to check for the ESP. It's not good enough to skip that check, because an object script without its ESP may be parsed but will still break your game because it's an orphan: a class that has no possible instances.
Only global scripts can operate properly without their parent ESP or ESM, because a global script cannot be instanced.
By putting the interactions in a global function in a script that contains only global functions, we can be sure that our mod will never try to reference those types unless we explicitly call the global.
The same is not the case for a script attached to a quest. If you attach a script to a quest, its papyrus instance will be created when the quest is started.
If the quest is auto-start, that will be on first load of the game.
So, if you reference a foreign type - even if due to control flow, you do not execute that reference - it is still processed by Skyrim when the potential calling script is loaded.
Skyrim must know the type's signature, and to do that it instantiates the type, and to do that, requires an ESP.
Using global scripts stops this, because they are not pre-instantiated, and they are only instantiated when executed, and executed when explicitly called.
But we must be sure not to call those globals at sensitive times, or if the scripts (and types) they rely on do not exist, or cannot be instantiated.
Thus, we check the ESP of the foreign mod exists before trying to call the globals, and we only call the globals from "safe" functions that will not run on first game startup.
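The ESP check itself can live in a tiny global helper that contains no foreign types at all, so it is safe to call from anywhere. This is only a sketch: the script name MyModSoftDeps and the function name IsPluginLoaded are made up for illustration. Game.GetModByName is the same call used in the examples further down.

```papyrus
Scriptname MyModSoftDeps Hidden
{Hypothetical helper script: global functions only, no foreign types.}

Bool Function IsPluginLoaded(String pluginName) Global
    ; Game.GetModByName returns 255 when the plugin is not in the load order.
    Return Game.GetModByName(pluginName) != 255
EndFunction
```

A caller would then guard every shim call with something like If MyModSoftDeps.IsPluginLoaded("Apropos2.esp") before touching the shim for that mod.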
I'm simplifying a bit. It's really only an issue with scripts that are bound to objects, but those are the most common soft-dependency. Any other kind of script would be a global, and would have no persistent internal state and no properties. For such scripts the mere existence of the script object is enough. If it's missing, your game will not explode. Though the code that tries to call it won't work and will crash, merely having a reference to it in your own quest won't cause an issue at all, beyond the crashing of your script that is.
Why must we avoid first game startup?
This is important because the foreign types we want to reference may not even be loaded, even if the ESP exists.
Skyrim loads the types when the ESP loads, and normally that happens in strict load order.
But if we reference a type in our script, it will cause the script that defines that type to be loaded before the ESP that instances it.
This is not well supported in Skyrim.
It seems like there is a bug in the on-demand loading of types.
The first mod to reference a foreign type will succeed - if the type properly exists that type will be loaded (out of order) at that time.
However, the second mod to reference the type will crash in flames.
My belief is that there is a flag that indicates the script is loaded, and that is set after the first load, but the ESP hasn't been loaded, just the script.
There is a special path that handles the unloaded case, and prevents it all blowing up, but that path is skipped when the flag indicates the script is already loaded.
The result is that the first consumer succeeds, the second consumer (and subsequent) fails, until after the parent mod for the referenced script has been loaded via the normal process.
I suspect also that this can lead to an improperly formed VMAD for the script. The script is loaded, but has an empty VMAD, rather than the proper loading process, which sets up the VMAD and loads the script into it. With a bad VMAD, the script never has anything but None in its object properties. I suspect that trying to reference this broken VMAD prior to loading the parent mod results in a CTD.
This explanation is somewhat speculative, but fits the observed behavior.
There seems to be some indication that there was an attempt to solve the dependency problem by loading on demand, and that code is broken. When exercised there are numerous ways it can leave the game in a corrupt state. On the other hand, if it's used only once, and the load was satisfied, you may get away with it apart from a mod that complains about mismatched types for no obvious reasons, and fails to load its quest properties.
Either way, we don't know the load order, and whether the other mods have loaded yet. This is why changing load order can sometimes fix issues.
If your mod is not following the pattern cleanly, but loads really late, it might get away with it anyway.
OnLoad is an edge case. OnLoad doesn't run on first load (strange, I know, but I believe this is on purpose so OnLoad can rely on initialization).
So, mods that reference foreign mods in their OnLoad should be safe? They are. Mostly.
In some cases, a player adds a mod during play. That mod may then get picked up by a "naughty" soft-depending mod that checks for soft-deps OnLoad, but the mod just added by the player might not have properly loaded. This is why checking in OnLoad is hazardous. If you don't add mods mid-play, it will be fine. And if you do, you may get away with it if the mod gets properly loaded before the OnLoad for the depending mod runs. SLD seems to get away with it in practice, but you shouldn't push your luck; I see the SLD behavior as a bug that needs to be fixed.
Practical Examples
So, where should we check for foreign mods? I believe the correct answer is in an OnUpdate handler.
When OnLoad runs, it should set an int "checkSoftDeps" variable to 1.
It should also set the presence of ALL soft-dep mods to false - keep a boolean flag for each one.
When OnUpdate runs, it has a branch that runs if checkSoftDeps > 0
Inside that branch it increments checkSoftDeps
If checkSoftDeps > 2, then run the handling code to check for soft deps, then reset checkSoftDeps to 0.
This makes it impossible to run the checks prematurely, but only requires one variable.
Inside the soft-dep check handler, we need to see if the ESPs are present.
If the ESP's index is not 255, the ESP is present, and we can set the presence flag to true.
e.g.
In OnLoad:
    checkSoftDeps = 1
    slifPresent = False
    aproposPresent = False
In OnUpdate:
    If checkSoftDeps > 0
        checkSoftDeps += 1
        If checkSoftDeps > 2
            slifPresent = 255 != Game.GetModByName("Sexlab Inflation Framework.esp")
            aproposPresent = 255 != Game.GetModByName("Apropos2.esp")
            checkSoftDeps = 0
        EndIf
    EndIf
In code that uses apropos:
    If aproposPresent && 255 != Game.GetModByName("Apropos2.esp")
        Actor aproposActor = MyModAproposShim.GetAproposActor()
        ...
    EndIf
In the Global function file MyModAproposShim:
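    Actor Function GetAproposActor() Global
        ; GetFormFromFile returns a plain Form, so it must be cast explicitly to the foreign type.
        Apropos2Actors actorLibrary = Game.GetFormFromFile(0x0002902C, "Apropos2.esp") As Apropos2Actors
        ....
        Return foundActor
    EndFunction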
Note how at the point of use, the mod ESP is checked again, even though we have the boolean? That's because the boolean may not get reset fast enough via OnLoad, but it saves us checking GetModByName over and over if the mod is never present.
Note how only the global function, safe in its own file, contains an actual TYPE from Apropos2: Apropos2Actors.
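The fragments above leave out how OnUpdate gets scheduled. Here is one way the wiring might look, as a rough sketch rather than the code from any actual mod. It assumes that "OnLoad" is handled through the OnPlayerLoadGame event on a player alias, that updates are requested with RegisterForSingleUpdate, and that a 5 second interval is acceptable. The names MyModSoftDepQuest, MyModPlayerAlias and ResetSoftDeps are hypothetical. Each script would go in its own file.

```papyrus
; ---- File: MyModSoftDepQuest.psc (hypothetical) ----
Scriptname MyModSoftDepQuest extends Quest
{Delays soft-dependency checks until a couple of updates after a game load.}

Int checkSoftDeps = 0
Bool Property aproposPresent = False Auto Hidden

; Called on game load: arm the counter and forget what we knew.
Function ResetSoftDeps()
    checkSoftDeps = 1
    aproposPresent = False
    RegisterForSingleUpdate(5.0) ; interval is an arbitrary assumption
EndFunction

Event OnUpdate()
    If checkSoftDeps > 0
        checkSoftDeps += 1
        If checkSoftDeps > 2
            ; Second update since load: only the ESP is checked here, no foreign types.
            aproposPresent = 255 != Game.GetModByName("Apropos2.esp")
            checkSoftDeps = 0
        Else
            ; Not yet time to check: ask for one more update.
            RegisterForSingleUpdate(5.0)
        EndIf
    EndIf
EndEvent

; ---- File: MyModPlayerAlias.psc (hypothetical, on an alias filled with the player) ----
Scriptname MyModPlayerAlias extends ReferenceAlias

Event OnPlayerLoadGame()
    (GetOwningQuest() As MyModSoftDepQuest).ResetSoftDeps()
EndEvent
```

Neither script mentions a type from Apropos2, so both can load safely whether or not that mod is installed. The foreign type stays confined to the MyModAproposShim file.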
When I wrote SLD, I was under the misapprehension that simply not using a type was enough. That is wrong.
If the type is not loaded, and you load a script that references it, you can break the game. Not necessarily though, and that is where the complex results that Monoman wrote about originate. See my comments above on first and second consumers.
This topic has actually been written about by engine experts, but the posts are hard to find.
Essentially, if you reference the type, and the mod isn't loaded, you will cause the type to be loaded if it can be. If that goes ok, then your mod may not cause a problem, but you may have now caused a type to load outside of the normal load order. The consequences can include the type itself breaking, and if it ends up trying to load references to ANOTHER type that is itself later in the LO, that's where things can really go bad. If you start this chain, and any type is missing, Skyrim fails to instantiate the type, and code - in some other mod - may then try to use it anyway, and CTD.
Thus if you follow the pattern, you stop other mods that aren't well-behaved from breaking the game.
Some things to think about...
If we have a type Foo that is defined in a script file, and that script file is used on two different quests that exist in two different ESPs, how does Skyrim know which one you mean if you use the type?
For example a quest type that is a common base used to create derivative quests.
It doesn't matter. The type is the same in either case, but when you deal with a variable of that type, it will come from one or the other ESP, resolving the ambiguity.
If the type was already loaded by a mod, prior in the LO, that will be the source of the type, until another mod loads the same type and overwrites it.
Why then does it matter that an owning ESP exists when we reference a type in a script? Can Skyrim even figure out how to identify the owner?
It probably doesn't, and can't for types that aren't objects: global scripts have no owner. However, non-object types are used in soft-dependencies rather infrequently.
Objects on the other hand, are special. They are easily located, and more importantly, they usually have properties, so Skyrim looks for the owning ESP so it can fill those properties, even if none actually exist. The difference is clear in the ESP in Tes5Edit.
Quests within the ESP have a VMAD (virtual machine adapter) which binds all the scripts, script fragments, and aliases, for the quest and their properties together. The quest is the object, and the types that are on that quest all have that same object instance.
Imagine linking to a DD object... Though it's an armor object, it has scripts on it, and thus it has a VMAD. It becomes clear from inspection that any script bound to an object requires a VMAD - that is what turns the script from functional code into object-oriented code. The only way for a script to have persistent internal state is to be bound to an object. Global scripts can only store persistent state via files, or trickery like StorageUtil, which are, strictly, external.