Cross Domain Optimization for Speech Enhancement: Parallel or Cascade?

1. Performance comparison at each stage of Parallel 2(Att)

SNR = -6

    Noisy(at 16kHz)       TF module 1        Time module

    TF module 2       Clean(Reference)

SNR = -3

    Noisy(at 16kHz)       TF module 1        Time module

    TF module 2       Clean(Reference)

SNR = 0

    Noisy(at 16kHz)       TF module 1        Time module

    TF module 2       Clean(Reference)

SNR = 3

    Noisy(at 16kHz)       TF module 1        Time module

    TF module 2       Clean(Reference)

SNR = 6

    Noisy(at 16kHz)       TF module 1        Time module

    TF module 2       Clean(Reference)

2. Performance comparison of different models

This part compares speech enhanced by our proposed model with the outputs of other models.

SNR = -6

    Noisy(at 16kHz)       DCCRN            DPRNN

    FullSubNet         Cascade 3         Parallel 2(Att)

SNR = -3

    Noisy(at 16kHz)       DCCRN            DPRNN

    FullSubNet         Cascade 3         Parallel 2(Att)

SNR = 0

    Noisy(at 16kHz)       DCCRN            DPRNN

    FullSubNet         Cascade 3         Parallel 2(Att)

SNR = 3

    Noisy(at 16kHz)       DCCRN            DPRNN

    FullSubNet         Cascade 3         Parallel 2(Att)

SNR = 6

    Noisy(at 16kHz)       DCCRN            DPRNN

    FullSubNet         Cascade 3         Parallel 2(Att)