Project Echo - Voice Packs - Deep Learning Voice Synthesis

Executaball · April 15, 2022

4 hours ago, Polonium said:

Holy shit, that's amazing! How long does each sentence take to synthesize?

It's not that long. I'm just working on the model workflow for the most part. Once things are ready it should be at around 8 lines per second, at least on my hardware. It'll be GPU dependent. Synthesis can happen in parallel, because FastPitch is what's known as a non-autoregressive model.
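A toy illustration of that distinction (this is not FastPitch code; the functions and numbers are made up for the sketch): in an autoregressive decoder each output depends on the previous output, so the steps form a chain, while in a non-autoregressive one every position depends only on the input and can be computed independently.

```python
from concurrent.futures import ThreadPoolExecutor

def autoregressive_decode(inputs):
    """Each output depends on the previous output, so steps must run in order."""
    outputs = []
    previous = 0.0
    for x in inputs:
        previous = 0.5 * previous + x  # toy recurrence standing in for a decoder step
        outputs.append(previous)
    return outputs

def frame_from_input(x):
    """Toy per-position function: it needs only its own input, not earlier outputs."""
    return 2.0 * x + 1.0

def non_autoregressive_decode(inputs, workers=4):
    """Positions are independent, so they can be mapped across workers in any order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(frame_from_input, inputs))

line = [0.1, 0.2, 0.3, 0.4]
sequential = autoregressive_decode(line)
parallel = non_autoregressive_decode(line)
```

The thread pool is only there to show that the positions are independent. On a GPU the same property shows up as one batched tensor operation over all positions (and over many lines at once) rather than as threads.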

Executaball · April 15, 2022

On 4/13/2022 at 8:37 PM, sagar123 said:

is this thread dead? wanted to see if we can get "thief" quest updated voice files

As far as I'm aware the English translation for 1.2.2.1 isn't actually finished yet?

sagar123 · April 16, 2022

8 hours ago, Executaball said:

As far as I'm aware the English translation for 1.2.2.1 isn't actually finished yet?

ohhh wait so u need that bme translation file... makes sense now.. thanks for replying... will wait for it and thanks again for the last thief voice pack.. it was pretty good and made overall mod experience amazing!

Dozen9292 · April 17, 2022

On 4/15/2022 at 11:48 AM, Executaball said:

It's not that long. I'm just working on the model workflow for the most part. Once things are ready it should be at around 8 lines per second, at least on my hardware. It'll be GPU dependent. Synthesis can happen in parallel, because FastPitch is what's known as a non-autoregressive model.

What hardware are you using currently?

Also, are you planning on making your current pipeline available?

Also also, this shit's looking legit. You planning on publishing the research?

Executaball · April 17, 2022

10 minutes ago, Dozen9292 said:

What hardware are you using currently?

Also, are you planning on making your current pipeline available?

Also also, this shit's looking legit. You planning on publishing the research?

Hey, glad you like how it came out. Hardware-wise, I'm currently using an RTX 3090. I imagine anything made in the last few years would technically work. The newer Ampere cards support FP16 and mixed precision, which allows larger training batches and lower memory usage. I'm looking at perhaps trying a Tesla A100 for the TensorFloat-32 capabilities, or maybe an A6000 to see how a larger batch size influences things (since that has 48 GB of VRAM).
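As a back-of-the-envelope sketch of why precision affects batch size: a half-precision value takes 2 bytes and a single-precision value takes 4, so the same tensor needs half the memory in FP16. The tensor shape below (batch size, frame count, 80 channels) is an assumption chosen for illustration, not a setting taken from this thread, and the estimate ignores weights, optimizer state, and the FP32 copies that mixed precision training keeps.

```python
import struct

BYTES_FP32 = struct.calcsize("f")  # IEEE 754 single precision
BYTES_FP16 = struct.calcsize("e")  # IEEE 754 half precision

def tensor_mebibytes(batch, frames, channels, bytes_per_value):
    """Memory for one dense tensor of shape (batch, frames, channels)."""
    return batch * frames * channels * bytes_per_value / 1024**2

# Hypothetical activation shape, for illustration only.
batch, frames, channels = 64, 800, 80

fp32_mib = tensor_mebibytes(batch, frames, channels, BYTES_FP32)
fp16_mib = tensor_mebibytes(batch, frames, channels, BYTES_FP16)
print(f"FP32: {fp32_mib:.1f} MiB, FP16: {fp16_mib:.1f} MiB")
```

Under these assumptions, halving the bytes per value leaves room for roughly twice the batch in the same activation budget, which is the effect described above.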

Pipeline-wise... it's not that I can't release it, but my current setup consists of random Jupyter notebooks and scripts that don't always work, and I keep adjusting them with each cycle. I am working with DanRuta on some parts of the model; he's also looking into some new capabilities with AI-based audio quality up-sampling for the near future. If anything becomes production ready, it'll probably end up in some form in xVASynth or xVATrainer. If I do end up with something stable, I'll look to release it as a pre-compiled voice model you can use with xVASynth.

I'm also looking to release my neural-network-based phoneme transcription model as a separate plugin for xVASynth so you can use it as an add-on for the existing models. I'm hoping I can get that out in the next week or so. I also have some work in progress for faster multi-threaded .lip synthesis using the FaceFX binaries. As you can imagine, nothing about the 10-year-old 32-bit Creation Kit is thread safe or well designed, so that's a pretty big project on its own. Currently it takes the better part of 7 hours to generate .lip files for 200k voice lines.
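One common way to parallelize a tool that is not thread safe is to split the work into batches and give each batch to its own worker, which runs a separate copy of the tool as its own process. The sketch below shows only the batching and dispatch. `process_batch` is a hypothetical placeholder, not the FaceFX or Creation Kit interface, and the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def process_batch(wav_paths):
    """Hypothetical placeholder worker.

    A real worker would launch its own copy of the lip tool as a separate
    process, in its own working directory, because the tool is not thread safe.
    Here it only derives the expected output file names.
    """
    return [Path(p).with_suffix(".lip").name for p in wav_paths]

def generate_all(wav_paths, batch_size=100, workers=4):
    """Dispatch batches to a pool of workers and flatten the results in order."""
    batches = chunked(list(wav_paths), batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return [name for batch in pool.map(process_batch, batches) for name in batch]

wavs = [f"line_{i:06d}.wav" for i in range(250)]  # made-up file names
lips = generate_all(wavs, batch_size=100, workers=4)

# Baseline from the figures above: 200k lines in roughly 7 hours.
baseline_lines_per_second = 200_000 / (7 * 3600)  # a little under 8 lines per second
```

How far this scales depends on whether the tool tolerates several instances running side by side, which is exactly the open question with the Creation Kit.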

Dozen9292 · April 17, 2022

45 minutes ago, Executaball said:

I'm also looking to release my neural-network-based phoneme transcription model as a separate plugin for xVASynth so you can use it as an add-on for the existing models. I'm hoping I can get that out in the next week or so. I also have some work in progress for faster multi-threaded .lip synthesis using the FaceFX binaries. As you can imagine, nothing about the 10-year-old 32-bit Creation Kit is thread safe or well designed, so that's a pretty big project on its own. Currently it takes the better part of 7 hours to generate .lip files for 200k voice lines.

I've not had speed issues with DanRuta's lip sync plugin, but that sounds great.

I'm really looking forward to whatever you can release. It all sounds sweet.

Executaball · April 17, 2022

3 minutes ago, Dozen9292 said:

I've not had speed issues with DanRuta's lip sync plugin, but that sounds great.

I'm really looking forward to whatever you can release. It all sounds sweet.

Oh is it? Hm. Might be just that I'm processing too fast with the batches? I found it significantly slowed down the actual voice processing. Around how many lines per second were you getting with the plugin on?

VomitMan · April 18, 2022

Looking forward to seeing all the Gen4 updates, especially ToH! Thanks for being back ❤️

Oblivion_Cat · April 18, 2022

Wow, what an impressive improvement with Gen4

Martok73 · April 18, 2022

Hi, is there any possibility of putting the new Submissive Lola 2.0.51 Gen4 voice files on a site other than Mega? I'm asking because in order to download the 11 GB file you have to purchase at minimum a monthly sub. Even the free MEGAsync app stops the download at 4.9 GB, so is there any way to upload to a different site that doesn't cost money to download from?

Seeker999 · April 18, 2022

3 hours ago, Martok73 said:

Hi, is there any possibility of putting the new Submissive Lola 2.0.51 Gen4 voice files on a site other than Mega? I'm asking because in order to download the 11 GB file you have to purchase at minimum a monthly sub. Even the free MEGAsync app stops the download at 4.9 GB, so is there any way to upload to a different site that doesn't cost money to download from?

You can still use the free tier for downloads, but the pause will happen depending on time of day and traffic. If you leave your connection up, it will continue to download after the waiting period, however many hours and minutes that is. I don't remember if you have to click a button or not, but you will have to wait, and you might have to redownload if something goes wrong the first time.

Executaball · April 18, 2022

7 hours ago, Martok73 said:

Hi, is there any possibility of putting the new Submissive Lola 2.0.51 Gen4 voice files on a site other than Mega? I'm asking because in order to download the 11 GB file you have to purchase at minimum a monthly sub. Even the free MEGAsync app stops the download at 4.9 GB, so is there any way to upload to a different site that doesn't cost money to download from?

I just realized the 192 kbps final bitrate I used this time might be slightly too high for realistic usage.

I'll re-encode it at 48 kbps (which is the level Bethesda used for SE voices) and reupload it today.

It should cut the file size to roughly a quarter.

applesandmayo · April 18, 2022

The file names between the Gen1.3 BSA and ESP for FHU Baka and the ones in the Gen4 are different. Will it be a problem if I rename those files to what they were on the 1.3, just to avoid issues?

bakedpugtatoes · April 18, 2022

Glad you're back, Executaball! I'm looking forward to an update for The Ancient Profession.

Executaball · April 18, 2022

15 minutes ago, applesandmayo said:

The file names between the Gen1.3 BSA and ESP for FHU Baka and the ones in the Gen4 are different. Will it be a problem if I rename those files to what they were on the 1.3, just to avoid issues?

Hm? What's the name difference actually? I haven't checked. The name of the .bsa and .esp files doesn't matter for Fill Her Up, since the voice pack does not interface with the original mod. As long as the .bsa and .esp have the same name, it's fine.

Martok73 · April 19, 2022

5 hours ago, Executaball said:

I just realized the 192 kbps final bitrate I used this time might be slightly too high for realistic usage.

I'll re-encode it at 48 kbps (which is the level Bethesda used for SE voices) and reupload it today.

It should cut the file size to roughly a quarter.

Awesome thank you

Dozen9292 · April 19, 2022

On 4/17/2022 at 7:52 PM, Executaball said:

Oh is it? Hm. Might be just that I'm processing too fast with the batches? I found it significantly slowed down the actual voice processing. Around how many lines per second were you getting with the plugin on?

Honestly haven't been paying attention to the processing time, so it's quite possible it is slowing it down. I haven't done anything big since I started using the plugin instead of CK (and I've upgraded the non-gpu bits of my rig quite a bit since then). I've mostly been capped by VRAM with my poor 2070S trying its hardest, but a quick test shows me that 8 lines/~2 seconds syncing + lips is about my max.

Here's hoping GPU prices do, in fact, come down in May. Grant money can't come soon enough...

Also I just noticed you said you're considering using an A100. Are/would you be using something like google compute? Or are you lucky enough to have another way of getting access?

applesandmayo · April 19, 2022

2 hours ago, Executaball said:

Hm? What's the name difference actually? I haven't checked. The name of the .bsa and .esp files doesn't matter for Fill Her Up, since the voice pack does not interface with the original mod. As long as the .bsa and .esp have the same name, it's fine.

Gen1.3, both are Voices1 for the bsa and esp, and Gen4 is Voice1 for both, so just a plurality thing.

Executaball · April 19, 2022

1 hour ago, Dozen9292 said:

Honestly haven't been paying attention to the processing time, so it's quite possible it is slowing it down. I haven't done anything big since I started using the plugin instead of CK (and I've upgraded the non-gpu bits of my rig quite a bit since then). I've mostly been capped by VRAM with my poor 2070S trying its hardest, but a quick test shows me that 8 lines/~2 seconds syncing + lips is about my max.

Here's hoping GPU prices do, in fact, come down in May. Grant money can't come soon enough...

Also I just noticed you said you're considering using an A100. Are/would you be using something like google compute? Or are you lucky enough to have another way of getting access?

Google compute is kind of unreliable. Even if you pay, they don't guarantee the specific GPU you will get. It's all random. Paying more just gives you a better chance of having high-tier resources assigned. If you want guaranteed resources, it'll be something like https://lambdalabs.com/service/gpu-cloud, though that's anywhere from $1.45 to $10 per hour.

Executaball · April 19, 2022

42 minutes ago, applesandmayo said:

Gen1.3, both are Voices1 for the bsa and esp, and Gen4 is Voice1 for both, so just a plurality thing.

Yeah it's fine, just load the new one only and remove the old one. The file name shouldn't matter as long as the esp and bsa have the same name.
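A quick way to check the "same name" rule before loading a pack is to compare the base names of the .bsa and .esp files. This is a minimal sketch based only on the advice above. The function name and the example file names are made up, and the comparison is case-insensitive on the assumption that the files live on a Windows file system.

```python
from pathlib import Path

def unmatched_plugins(filenames):
    """Return base names that have a .bsa without an .esp, or the reverse."""
    names = [str(f) for f in filenames]  # materialize so generators work too
    bsa = {Path(n).stem.lower() for n in names if n.lower().endswith(".bsa")}
    esp = {Path(n).stem.lower() for n in names if n.lower().endswith(".esp")}
    return sorted(bsa ^ esp)  # symmetric difference: present on one side only

# File names below are illustrative, not the actual pack contents.
ok = unmatched_plugins(["Voice1.bsa", "Voice1.esp"])
bad = unmatched_plugins(["Voices1.bsa", "Voice1.esp"])
```

An empty result means every archive has a plugin with the same base name. Any names returned are the ones to rename or remove.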

Dozen9292 · April 19, 2022

7 minutes ago, Executaball said:

Google compute is kind of unreliable. Even if you pay, they don't guarantee the specific GPU you will get. It's all random. Paying more just gives you a better chance of having high-tier resources assigned. If you want guaranteed resources, it'll be something like https://lambdalabs.com/service/gpu-cloud, though that's anywhere from $1.45 to $10 per hour.

Yup, that'd be why I said "something like". I've been burned by GC, wanting to use it for research and getting only the crummy systems for days.

How long are your models taking to train?

(I'm assuming, of course, that once trained your models slot right into xVASynth.)

Executaball · April 19, 2022

Alright, the new version of Submissive Lola voices is live on the page, using the same link. Size is reduced from 12 GB to 3.46 GB.
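For reference, the expected saving from the 192 kbps to 48 kbps change can be worked out from the bitrate alone, since encoded audio size is roughly bitrate times duration. The sketch below compares that with the archive sizes reported above. The 3-second line length is an arbitrary assumption and cancels out of the ratio.

```python
def audio_bytes(bitrate_kbps, seconds):
    """Approximate encoded audio size: kilobits per second * duration / 8 bits per byte."""
    return bitrate_kbps * 1000 * seconds / 8

seconds = 3.0  # hypothetical length of one voice line; cancels out in the ratio

expected_ratio = audio_bytes(192, seconds) / audio_bytes(48, seconds)  # 192 / 48
observed_ratio = 12 / 3.46  # archive sizes reported above, in GB

print(f"expected about {expected_ratio:.2f}x, observed about {observed_ratio:.2f}x")
```

The observed reduction comes out somewhat below the bitrate ratio. One plausible explanation, offered here only as a guess, is that the archive also holds data that does not shrink with the audio bitrate, such as .lip files and container overhead.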

Executaball · April 19, 2022

1 hour ago, jperrins66 said:

The lola LE link is pointing to the SE file.

That's weird. The link must not have updated. Fixed, thanks for letting me know.

rusty2 · April 19, 2022

This mod looks amazing, and I appreciate all the hard work. I do have a question though. How do you install these voices? Is it as simple as installing them with a mod manager, or do you have to install them manually?

Executaball · April 19, 2022

6 hours ago, rusty2 said:

This mod looks amazing, and I appreciate all the hard work. I do have a question though. How do you install these voices? Is it as simple as installing them with a mod manager, or do you have to install them manually?

You can use any mod manager or just install them directly. Each pack comes with one or more .bsa and .esp files. It's just like how you installed the original mod.
