Encoding: support wav, flac etc. #630

Merged 19 commits on Apr 14, 2025

Conversation

@NicolasHug (Member) commented Apr 9, 2025

This PR adds support for encoding formats where the encoder doesn't natively support FLTP, which is the sample format of the input waveform. We use swresample to convert the input FLTP AVFrames into a sample format that the encoder supports.
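
For readers unfamiliar with sample formats: FLTP is planar 32-bit float (one buffer per channel), while many encoders expect something like interleaved signed 16-bit integers. The sketch below shows that kind of conversion in plain Python. It is only an illustration of the idea, not what swresample or this PR does internally; the function name is hypothetical, and the scale-by-32768-then-clamp convention is an assumption.

```python
def fltp_to_s16_interleaved(planes):
    """Convert planar float samples in [-1.0, 1.0] to interleaved int16 values.

    `planes` is a list of per-channel lists, all of the same length.
    Hypothetical helper for illustration only.
    """
    if not planes:
        return []
    num_samples = len(planes[0])
    if any(len(plane) != num_samples for plane in planes):
        raise ValueError("all channels must have the same number of samples")

    out = []
    for i in range(num_samples):
        for plane in planes:
            # Clamp to [-1, 1], scale to the int16 range, round to nearest,
            # then clamp again because +1.0 would map to 32768.
            clamped = max(-1.0, min(1.0, plane[i]))
            value = int(round(clamped * 32768.0))
            out.append(max(-32768, min(32767, value)))
    return out
```

With two channels, the output alternates channel 0 and channel 1 for each sample index, which is what "interleaved" means here.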

@facebook-github-bot added the CLA Signed label Apr 9, 2025
@NicolasHug marked this pull request as ready for review April 9, 2025 12:40
@@ -92,14 +92,13 @@ AudioEncoder::AudioEncoder(
validateSampleRate(*avCodec, sampleRate);
avCodecContext_->sample_rate = sampleRate;

// Note: This is the format of the **input** waveform. This doesn't determine
// the output.
Member Author

My original comment was wrong: that's not the format of the input waveform. It's the format of the input AVFrame that we pass to avcodec_send_frame(). And it needs to be a format that the codec supports.
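
To make the point above concrete: the encoder's sample format has to be picked from the formats the codec advertises. A minimal sketch of one possible selection rule follows, assuming we prefer FLTP when it is available and otherwise fall back to the first advertised format. The function name and the fallback policy are assumptions for illustration, not necessarily what this PR implements; the strings are FFmpeg's short sample format names.

```python
def pick_encoder_sample_format(supported, preferred="fltp"):
    """Return the sample format to set on the codec context.

    `supported` is the list of sample format names the codec advertises.
    Hypothetical helper: prefers `preferred`, else falls back to the
    first supported format (an assumed policy).
    """
    if not supported:
        raise ValueError("codec advertises no supported sample formats")
    return preferred if preferred in supported else supported[0]
```

When the returned format differs from FLTP, the input frames need to be converted before being passed to avcodec_send_frame().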

avCodecContext_->frame_size > 0,
"frame_size is ",
avCodecContext_->frame_size,
". Cannot encode. This should probably never happen?");
Member Author

This won't always be non-zero, see below.

Contributor

@scotts commented Apr 11, 2025

Interesting - it might be worth noting why from the docs (https://ffmpeg.org/doxygen/6.0/structAVCodecContext.html#aec57f0d859a6df8b479cd93ca3a44a33, which I admit to not understanding) when we turn 0 into our default.
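
For context on the linked docs: frame_size is the number of samples per channel in an audio frame, and it may be 0 when the codec has AV_CODEC_CAP_VARIABLE_FRAME_SIZE set, meaning the frame size is not restricted. A rough sketch of the "turn 0 into a default" idea, splitting a waveform into frame-sized chunks, is below. The default of 256 and both function names are hypothetical and chosen only for illustration.

```python
DEFAULT_FRAME_SIZE = 256  # hypothetical default, not taken from the PR


def samples_per_frame(codec_frame_size, default=DEFAULT_FRAME_SIZE):
    """Use the codec's frame size, or a default when the codec reports 0."""
    return codec_frame_size if codec_frame_size > 0 else default


def frame_boundaries(num_samples, codec_frame_size):
    """Return (start, end) sample index pairs, one per frame to encode."""
    step = samples_per_frame(codec_frame_size)
    return [
        (start, min(start + step, num_samples))
        for start in range(0, num_samples, step)
    ]
```

Only the last frame may be shorter than the chosen frame size, which matches the constraint for fixed-frame-size codecs.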

# Check that decode(encode(samples)) == samples on lossless formats

if get_ffmpeg_major_version() == 4 and output_format == "wav":
    pytest.skip("Swresample with FFmpeg 4 doesn't work on wav files")
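
The round-trip property this test checks, decode(encode(samples)) == samples on lossless formats, can be illustrated without torchcodec. The sketch below uses Python's standard-library wave module on 16-bit mono PCM rather than the encoder from this PR; encode_wav and decode_wav are hypothetical helpers written for this example.

```python
import io
import struct
import wave


def encode_wav(samples, sample_rate=16000):
    """Encode a list of int16 samples as mono 16-bit PCM WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def decode_wav(data):
    """Decode mono 16-bit PCM WAV bytes back into a list of int16 samples."""
    with wave.open(io.BytesIO(data), "rb") as reader:
        raw = reader.readframes(reader.getnframes())
    return list(struct.unpack(f"<{len(raw) // 2}h", raw))
```

Because 16-bit PCM stores integer samples verbatim, the decoded list equals the input exactly. With float input, as in the real test, equality only holds up to the quantization of the target sample format.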
Member Author

FYI, we're hitting this:

status = swr_init(swrContext);
TORCH_CHECK(
    status == AVSUCCESS,
    "Couldn't initialize SwrContext: ",
    getFFMPEGErrorStringFromErrorCode(status),
    ". If the error says 'Invalid argument', it's likely that you are using "
    "a buggy FFmpeg version. FFmpeg4 is known to fail here in some "
    "valid scenarios. Try to upgrade FFmpeg?");

@NicolasHug merged commit 2c137e7 into pytorch:main Apr 14, 2025
30 checks passed