Azatthot
Posted 14 hours ago (edited)

Hello,

I do not consider myself a modder, but I do know some things about voice generation (still a noob, though), so I tried to write a comprehensive guide for it, for experts and beginners alike. I am creating this so we have something to guide us when we return after a long break from gaming/modding. It is not only about one of the most famous mods, "Dragon Born Voice Over-DBVO", but also about voice generation in general.

1. Extraction of dialogues

I find Xedit .pas scripts to be the most effective and nearly perfect tool for this job. You can extract both a mod file's dialogues and the DBVO dialogues.

2. Generating the voices

Thanks to AI we have many options nowadays. Elevenlabs is almost obsolete thanks to QwenTTS. I think many modders, new and old, will soon be generating next-generation voice mods.

The hurdles

a.) .lip generation - a .lip file is an animation file which tells the game how the character's mouth moves. You can get it here: https://www.nexusmods.com/skyrimspecialedition/mods/40971 or https://www.nexusmods.com/skyrimspecialedition/mods/17765 (it's inside its subdirectories). How to use it: https://github.com/Nukem9/FaceFXWrapper

b.) Not every .wav can generate a workable .lip file, because the program only works on 16 kHz mono-channel audio. ffmpeg can do the conversion. For reference, even a dual-core CPU from 15 years back could convert 10,000 files in around 20 minutes. For a modern CPU it is barely a problem.

c.) Then fuse both into a .fuz file and the characters will move their mouths with the voice. (Don't make the final .fuz with the 16 kHz mono-channel audio; it is a lower-quality voice.)

The most famous option for voice pack generation is XVASynth. But I bet that will change soon, because voice cloning can now be done in around 1-2 minutes. I created a femaleelfhaughty voice clone in around 2 minutes using an Nvidia 3060. So why, even in our community where people have absurdly powerful PCs, do we not have good voice packs?
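As a side note on hurdle (b): a minimal sketch of the 16 kHz mono conversion, assuming ffmpeg is on your PATH. The function names and the folder layout (a separate output folder mirroring the input folder) are made up for illustration; only the ffmpeg options `-ar` (sample rate) and `-ac` (channel count) come from ffmpeg itself.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src: Path, dst: Path) -> list[str]:
    """Return an ffmpeg command that resamples src to 16 kHz mono."""
    return [
        "ffmpeg", "-y",   # overwrite an existing output without asking
        "-i", str(src),   # input file
        "-ar", "16000",   # output sample rate: 16 kHz
        "-ac", "1",       # output channels: mono
        str(dst),
    ]


def convert_tree(src_dir: Path, dst_dir: Path) -> int:
    """Convert every .wav under src_dir into dst_dir, keeping sub-folders.

    dst_dir is assumed to be outside src_dir (hypothetical layout).
    Returns the number of files converted.
    """
    count = 0
    for src in sorted(src_dir.rglob("*.wav")):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises if ffmpeg exits with an error
        subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
        count += 1
    return count
```

Something like `convert_tree(Path("voices_full"), Path("voices_16k"))` would leave the full-quality files untouched, so the 16 kHz copies are used only for .lip generation and the originals go into the final .fuz.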
The reason is that XVASynth, the most used option for voice generation, has problems:

1. Its .csv handling. Its delimiter is the comma `,`, which is a legitimate problem when the dialogue lines themselves contain commas.

2. Every line is generated for every voicetype. So if a mod has a line for Ysolda saying "Wanna buy my skooma?", that line will be generated for every voicetype by XVASynth's logic. This makes the process around 10 times longer. It is a headache to deal with if a pack contains around 100,000 lines (and that number is actually on the lighter side!).

3. The .lip generation. For newbies who use XVASynth: it produces the .fuz for every single line individually along the way, so the throughput is neither high (1.2-2.8 lines per second even on high-end PCs) nor efficient, and it takes a stupid amount of time, like 2-3 days of running the PC constantly. What I do is build a unified cache of .lip files and reuse the same .lip for every voicetype, which cuts the production time of a voicepack by around 90% (to only 2-3 hours). This has a drawback, because no two voice types produce exactly the same line length, but nobody has noticed so far, and nobody would have until someone spilled the beans. Oops.

I believe with full faith that in the near future people will use newer AI models for voice generation, as they are so much more expressive it is not even a joke.

Now let's tackle the elephant in the room: the DBVO voice packs.

IDK why people think it is a strength that DBVO picks up dialogue lines as voice lines; it is a headache. The only benefit I think it provides is that if 100 mods have the voice line `What?`, we don't have to generate `What_.fuz` 100 times. It's a mess - I do not care how many I offend, but it's a fact. Anybody can extract the voice lines and make voices, but the god-damned naming is awful. It has many limitations:

1. First, DBVO voice lines cannot be long.
If your Skyrim is installed in something like Drive://directory/sub-directory/sub-sub directories/TES - Skyrim - Anniversary Edition, DBVO will have a long path to feed into the fragile Skyrim world, and it can, no, it WILL crash the game.

2. "The character handling": mod authors sometimes use fancy characters (I mean « » * < >) in their mods, which is not a problem per se, but DBVO bugs out here.

3. From what I can gather, some experienced mod authors use normalization to tackle this, which is fine. From what I can say after experimenting with it:
-> `!` is allowed but `? * < >` are not
-> ` ` (space) is not allowed
-> `()` is allowed
-> `[]` is not allowed; in fact, except for normal brackets `()`, no other kind of bracket is allowed. The character will stay silent
-> leading and trailing spaces matter: ` What is your name? `, `What is your name? `, `What is your name?` and `What is your name!?` are all 4 different
-> every character that is not allowed has to be replaced with an underscore `_`
-> parentheses are just not consistent

I think even this guide `https://www.nexusmods.com/skyrimspecialedition/articles/12180` is not complete, as it misses so many things. Even I don't understand it clearly. But it is a fact that some lines will not get voiced no matter what you name them; the game is essentially blind to them. The simple technique I use is a script that makes a DBVO fix and replaces all the damned parentheses with nothing, so the game at least sees the text. With that, it does at least voice every single dialogue line.

"No, I will not release that, as I cannot maintain such a project, and it would render all existing voicepacks useless."

Thank you for taking time out of your day to read this. If this helped you in any way, I am glad. Feel free to ask anything here. If I can, I will answer. If I don't, I just couldn't; wait a few days. Thanks.

Edited 14 hours ago by Azatthot
wanted to see colorized text
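The naming rules listed under point 3 can be sketched as a small helper. This is only an illustration of the replace-with-underscore rule as the post describes it: the character set is taken from the post's own experiments, not from DBVO's documentation, the function name is hypothetical, and punctuation the post does not mention is left untouched. It does not cover the parenthesis-stripping fix, which changes DBVO itself.

```python
# Characters the post reports as not allowed in DBVO file names.
# Assumption: this set comes from the post, not from DBVO itself.
DISALLOWED = set(" ?*<>[]«»")


def dbvo_filename(line: str, ext: str = ".fuz") -> str:
    """Map a dialogue line to a file name, replacing disallowed characters with '_'."""
    # No strip() here: the post notes that leading and trailing spaces
    # make otherwise identical lines different.
    safe = "".join("_" if ch in DISALLOWED else ch for ch in line)
    return safe + ext


if __name__ == "__main__":
    for text in ["What?", "What is your name?", " What is your name? ", "(Lie) Sure!"]:
        print(repr(text), "->", dbvo_filename(text))
```

With this rule `What?` becomes `What_.fuz`, matching the example earlier in the post, and the variants with leading or trailing spaces each get their own file name.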