Project Echo - Voice Packs - Deep Learning Voice Synthesis

aspirine2 · January 21, 2022

hey is there an easy way to create empty esp for MME !? many thanks in advance

TrollAutokill · January 21, 2022

4 hours ago, QuantumWraith said:

@Executaball Hi, I'm generating some voice packs for some of the mods on my modlist and got curious to check out the voice files you included for Dovahkiin's Infamy. I noticed there is a folder for maleuniqueghost. I can't seem to find this voice model anywhere; it's not listed on XVASynth's modpage. Did you train it yourself and are you able to share it?

@Executaball If there is a way to train voices ourselves, I am interested to know! There are a few voiced followers who could use the treatment.

BigOnes69 · January 21, 2022

4 hours ago, TrollAutokill said:

@Executaball If there is a way to train voices ourselves, I am interested to know! There are a few voiced followers who could use the treatment.

Yeah just tie them to a post and beat them till u get the right pitch. LOL.

DevilFireSmile · January 21, 2022

This mod is AMAZING! Incredibly immersive.

*Leoosp* · January 22, 2022

The voice pack for Simple Slavery++ needs an update as the modification it is for is now at version 6.3.14, so can anyone do this pretty please?

The voice pack is currently for version 6.3.12 of Simple Slavery++, and there have been dialogue additions and alterations since.

Edited January 22, 2022 by Leoosp

masterchief24 · February 13, 2022

Quote

14. Thief BSA Voice Pack V1.0.0.8-BME

It looks like the SE Split Version 1 zip files are corrupted.

Spoiler

[SE] (Gen 2.1) Thief v1.0.8-BME EVP v1.z01: Dogma - Thief_Voices3.bsa - Cannot read archive data
An error occurred

Edited February 13, 2022 by masterchief24

Seeker999 · February 13, 2022

22 hours ago, masterchief24 said:

It looks like the SE Split Version 1 zip files are corrupted.

I could download and open Part 0, but Part 1 was big enough that ~~I'd have~~ they want me to pay and I'm not doing that. I just wanted to test it for myself to see if I can open the archive. It's currently downloading, we'll see how it goes.

I will say that in the past when I got errors downloading, sometimes just going to a different browser worked.

Update: I was able to download Part 1. Once I changed the file extension from .z01 to .zip I was able to open it.
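For anyone who would rather script that rename step, here is a minimal Python sketch. It copies the part to a new .zip file instead of renaming it, so the original download stays untouched. The function name `copy_as_zip` is made up for illustration, and this only mirrors the workaround described above; it is not guaranteed to work for every split archive.

```python
import shutil
from pathlib import Path


def copy_as_zip(part_path):
    """Copy a split-archive part (e.g. *.z01) to a sibling file ending in .zip."""
    src = Path(part_path)
    dst = src.with_suffix(".zip")  # swaps only the final extension
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; not overwriting")
    shutil.copy2(src, dst)  # copy, so the original download is left as-is
    return dst
```

Call it with the path to the downloaded part, then try opening the returned .zip in your archive tool.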

Edited February 14, 2022 by Seeker999
update

devildx · February 19, 2022

Still waiting for the voice pack for Sexy Bandit Captives SSE v0.971 AIO.

If anyone wants to make the voice pack but doesn't have the mod, send me a PM.

VAAlucard · February 20, 2022

So, I've seen where people have asked about xVASynth for Apropos. I'd like to see this happen as well, but as stated a few pages back, Apropos doesn't use Skyrim's native dialogue to show the text, and I believe xVASynth requires that.

I do wonder, however, since there is a patch for Fuz Ro Bork that reads books for you, whether that could somehow be reverse engineered to find a way to patch Apropos into Fuz Ro Bork, and whether that would be enough to get xVASynth to play audio for Apropos.

Maybe a script for Apropos that hooks into the Audio Book version of Fuz Ro Bork. The messages would unfortunately have to load as a book UI, which may be annoying for the sake of immersion. This, however, would make it ?incompatible? with any mod that allows Skyrim to proceed in the background with a menu or book open.

This isn't intended as a request, more just a rambling of an idea that may be a viable workaround, however this is coming from someone with a rudimentary understanding of making mods.

PixnMinx · April 4, 2022

This looks absolutely amazing. Great work! Could I please put in a vote for this to be done for Maria Eden SE? It would take the immersion of that mod to the next level.

sofzx · April 8, 2022

Thank you so much for your work! it's really so fun for immersion lol

TrollAutokill · April 8, 2022

On 2/19/2022 at 11:21 PM, devildx said:

Still waiting for the voice pack for Sexy Bandit Captives SSE v0.971 AIO.

If anyone wants to make the voice pack but doesn't have the mod, send me a PM.

I have Sexy_Bandit_Captives_SSE_v0.971_D_AIO_Plus_MCM

I guess it should do, I will see what I can do.

EDIT: There is a new version, 0.98d. It seems SBC was recently revived!

Edited April 8, 2022 by TrollAutokill

devildx · April 8, 2022

1 hour ago, TrollAutokill said:

I have Sexy_Bandit_Captives_SSE_v0.971_D_AIO_Plus_MCM

I guess it should do, I will see what I can do.

EDIT: There is a new version, 0.98d. It seems SBC was recently revived!

No need anymore. I already did it for myself. I play with the Skyfem mod, so my version has only female voices, including female voices for the male dialogue. You can just delete the male folders if you want only the female voices.

https://www.loverslab.com/topic/111415-skyfem-all-npcs-now-female-special-edition/page/31/#comment-3696506

Great to see that the mod is up again.

EDIT: Seems like the new version is voiced so only use my file for the specific version 0.971 AIO.

Edited April 8, 2022 by devildx

TrollAutokill · April 8, 2022

1 minute ago, devildx said:

No need anymore. I already did it for myself. I play with the Skyfem mod, so my version has only female voices, including female voices for the male dialogue. You can just delete the male folders if you want only the female voices.

https://www.loverslab.com/topic/111415-skyfem-all-npcs-now-female-special-edition/page/31/#comment-3696506

Great. Is that 0.98 or 0.97? Maybe it doesn't make any difference.

devildx · April 8, 2022

38 minutes ago, TrollAutokill said:

Great. Is that 0.98 or 0.97? Maybe it doesn't make any difference.

It is for the old version 0.971. The new 0.98 seems to be a completely new mod, and the description says that it is already voiced.

I will test the new version when I start a new game. Just waiting for some other mods to update before I do.

Jasmine92 · April 8, 2022

On 1/21/2022 at 7:54 AM, TrollAutokill said:

@Executaball If there is a way to train voices ourselves, I am interested to know! There are a few voiced followers who could use the treatment.

I don't know if you are still interested, but Dan Ruta (creator of xVASynth) released xVATrainer to train your own voices. https://www.nexusmods.com/skyrimspecialedition/mods/65022/

sofzx · April 9, 2022

Fill Her Up got an update. Should it work?

sagar123 · April 14, 2022

is this thread dead? wanted to see if we can get "thief" quest updated voice files

Executaball · April 14, 2022

Hey all. Sorry for the extended absence, but I have been working on the voice generation workflow in the meantime. For the past months I have been working with some other researchers on an end-to-end model for phoneme and pitch-energy modulation to complement the current FastPitch model by Nvidia. Quality is improved by a notable margin, and the biggest issue with current AI-generated voices (awkward pronunciations) is fixed for the most part.

It turns out English is actually quite hard to pronounce: some words have multiple pronunciations despite having the same spelling when used in different contexts. We have over 950 words in English where the pronunciation or intonation changes based on context or usage of the word.
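To make the problem concrete, here is a toy Python sketch of a few such words. It is only a hand-written lookup table, not the neural model described in this post: the part-of-speech tag has to be supplied by hand, whereas the real pipeline is meant to infer it from the sentence. The table entries and the function name `pronounce` are illustrative assumptions, with phonemes written in ARPAbet.

```python
# Toy heteronym table: (spelling, part of speech) -> ARPAbet phonemes.
# Digits mark stress: 1 = primary stress, 0 = unstressed.
HETERONYMS = {
    ("record", "noun"): "R EH1 K ER0 D",
    ("record", "verb"): "R IH0 K AO1 R D",
    ("lead", "noun"): "L EH1 D",  # the metal
    ("lead", "verb"): "L IY1 D",
    ("present", "noun"): "P R EH1 Z AH0 N T",
    ("present", "verb"): "P R IY0 Z EH1 N T",
}


def pronounce(word, part_of_speech):
    """Return the phoneme string for a heteronym, or None if it is not in the toy table."""
    return HETERONYMS.get((word.lower(), part_of_speech))
```

Note how "record" and "present" change both their vowels and which syllable carries the stress, so a model that only sees the spelling has no way to pick the right one.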

So in the current model I've employed separate end-to-end neural network models: one for pronunciation and deducing pronunciation, the new FastPitch models, and another for energy modulation for emotion. This architecture is partly based on FastSpeech and the proposed RAD-TTS structure, which was released last year by Nvidia at Interspeech.

FastSpeech: New text-to-speech model improves on speed, accuracy, and controllability - Microsoft Research

Additionally there is now a residual convolutional neural network added for lexical stress detection. This is where, in human speech, we would put stress on certain syllables of words depending on the sentence structure:

[Image attachment: image.png]

It is now possible for neural network based voice models to know the context of the sentence and use that in predicting how to pronounce a word. Here's an example of what we currently have to deal with:

Here is an example of a sentence using our new model pipeline with feed-forward neural network based contextual recognition:

I can't really demonstrate the energy differences but you can try some of the examples below to get a feel of the minor improvements in the model's expression of emotion and sentence stress. I still have some stuff to fix, with neural networks nothing is really constant and there will still be issues moving forward, but... If all goes well I might start going over the current voice packs and updating them to the new types and mod versions in the next few weeks.

Here are some example comparisons in the meantime between the current pipeline and my new one.

And here are just some more exported voice lines using Submissive Lola as an example. Remember these are all part of the model itself, no individual tweaking or adjustments can be done, which is important since covering voices for LoversLab type mods can go up to hundreds of thousands of lines easily:

Edited April 14, 2022 by Executaball

keitsoru · April 15, 2022

On 4/8/2022 at 8:23 AM, devildx said:

No need anymore. I already did it for myself. I play with the Skyfem mod, so my version has only female voices, including female voices for the male dialogue. You can just delete the male folders if you want only the female voices.

https://www.loverslab.com/topic/111415-skyfem-all-npcs-now-female-special-edition/page/31/#comment-3696506

Great to see that the mod is up again.

EDIT: Seems like the new version is voiced so only use my file for the specific version 0.971 AIO.

Hey, just curious, but you wouldn't happen to have genderswapped voices for vanilla dialogue, would you?

Polonium · April 15, 2022

11 hours ago, Executaball said:

Spoiler

Hey all. Sorry for the extended absence, but I have been working on the voice generation workflow in the meantime. For the past months I have been working with some other researchers on an end-to-end model for phoneme and pitch-energy modulation to complement the current FastPitch model by Nvidia. Quality is improved by a notable margin, and the biggest issue with current AI-generated voices (awkward pronunciations) is fixed for the most part.

It turns out English is actually quite hard to pronounce: some words have multiple pronunciations despite having the same spelling when used in different contexts. We have over 950 words in English where the pronunciation or intonation changes based on context or usage of the word.

So in the current model I've employed separate end-to-end neural network models: one for pronunciation and deducing pronunciation, the new FastPitch models, and another for energy modulation for emotion. This architecture is partly based on FastSpeech and the proposed RAD-TTS structure, which was released last year by Nvidia at Interspeech.

Additionally there is now a residual convolutional neural network added for lexical stress detection. This is where, in human speech, we would put stress on certain syllables of words depending on the sentence structure:

It is now possible for neural network based voice models to know the context of the sentence and use that in predicting how to pronounce a word. Here's an example of what we currently have to deal with:

Here is an example of a sentence using our new model pipeline with feed-forward neural network based contextual recognition:

I can't really demonstrate the energy differences but you can try some of the examples below to get a feel of the minor improvements in the model's expression of emotion and sentence stress. I still have some stuff to fix, with neural networks nothing is really constant and there will still be issues moving forward, but... If all goes well I might start going over the current voice packs and updating them to the new types and mod versions in the next few weeks.

Here are some example comparisons in the meantime between the current pipeline and my new one.

And here are just some more exported voice lines using Submissive Lola as an example. Remember these are all part of the model itself, no individual tweaking or adjustments can be done, which is important since covering voices for LoversLab type mods can go up to hundreds of thousands of lines easily:

Holy shit, that's amazing! How long does each sentence take to synthesize?

devildx · April 15, 2022

2 hours ago, keitsoru said:

Hey, just curious, but you wouldn't happen to have genderswapped voices for vanilla dialogue, would you?

Just look in the previous page.

https://www.loverslab.com/topic/111415-skyfem-all-npcs-now-female-special-edition/page/30/#comment-3695629

Executaball · April 15, 2022

4 hours ago, Polonium said:

Holy shit, that's amazing! How long does each sentence take to synthesize?

It's not that long. I'm just working on the model workflow for the most part. Once things are ready it should be at around 8 lines per second, at least on my hardware. It'll be GPU dependent. Synthesis can happen in parallel because FastPitch is what's known as a non-autoregressive model.
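As a rough back-of-the-envelope check, the Python sketch below turns that rate into a total run time. The default of 8 lines per second is the figure quoted above for one particular setup; the 100,000-line count in the usage note is only an illustrative number, and real throughput will depend on the GPU and on line length.

```python
def estimate_hours(num_lines, lines_per_second=8.0):
    """Estimate wall-clock hours needed to synthesize num_lines voice lines."""
    if lines_per_second <= 0:
        raise ValueError("lines_per_second must be positive")
    seconds = num_lines / lines_per_second
    return seconds / 3600.0  # 3600 seconds per hour
```

For example, `estimate_hours(100_000)` comes out to roughly 3.5 hours at 8 lines per second.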

Executaball · April 15, 2022

On 4/13/2022 at 8:37 PM, sagar123 said:

is this thread dead? wanted to see if we can get "thief" quest updated voice files

As far as I'm aware the English translation for 1.2.2.1 isn't actually finished yet?

sagar123 · April 16, 2022

8 hours ago, Executaball said:

As far as I'm aware the English translation for 1.2.2.1 isn't actually finished yet?

ohhh wait so u need that bme translation file... makes sense now.. thanks for replying... will wait for it and thanks again for the last thief voice pack.. it was pretty good and made overall mod experience amazing!
