Stutter when video is played its full length

Hello Shotstack Community. Hoping someone can help me understand and perhaps solve the following problem.

I have a video that is 11.01 seconds in length.

If I play it in Shotstack Studio for a length of 11.01 seconds, it stutters at the end. If I shave a tenth of a second off the video length and play it for 10.91 seconds, there is no stutter.

Does anyone know why it stutters when played for its actual length? Below is JSON for both scenarios:

STUTTER

{
  "timeline": {
    "background": "#000000",
    "tracks": [
      {
        "clips": [
          {
            "asset": {
              "type": "video",
              "src": "https://peersuma-store.s3.amazonaws.com/project_64fdc30ae33ea60bd3f58231/input/working/923f3759-539f-468d-8cf0-99cacf93954d_20230911_193147.mp4",
              "volume": 1,
              "trim": 0
            },
            "start": 0,
            "length": 11.01
          }
        ]
      }
    ]
  },
  "output": {
    "format": "mp4",
    "size": {
      "width": 1920,
      "height": 1080
    },
    "fps": 30,
    "destinations": [
      {
        "provider": "shotstack",
        "exclude": true
      }
    ]
  }
}

NO STUTTER

{
  "timeline": {
    "background": "#000000",
    "tracks": [
      {
        "clips": [
          {
            "asset": {
              "type": "video",
              "src": "https://peersuma-store.s3.amazonaws.com/project_64fdc30ae33ea60bd3f58231/input/working/923f3759-539f-468d-8cf0-99cacf93954d_20230911_193147.mp4",
              "volume": 1,
              "trim": 0
            },
            "start": 0,
            "length": 10.91
          }
        ]
      }
    ]
  },
  "output": {
    "format": "mp4",
    "size": {
      "width": 1920,
      "height": 1080
    },
    "fps": 30,
    "destinations": [
      {
        "provider": "shotstack",
        "exclude": true
      }
    ]
  }
}

This is one of the issues that occurs with user-generated content (UGC) recorded on mobile devices. They often record with a variable frame rate, which is used to compress the video and audio.

The problem is that while browsers are able to adjust for variable video and audio frame rates, right now our video editor cannot make these adjustments during editing.

If you inspect the source video using our probe endpoint, you will get the following output:

https://api.shotstack.io/v1/probe/https%3A%2F%2Fpeersuma-store.s3.amazonaws.com%2Fproject_64fdc30ae33ea60bd3f58231%2Finput%2Fworking%2F923f3759-539f-468d-8cf0-99cacf93954d_20230911_193147.mp4

{
    "success": true,
    "message": "ok",
    "response": {
        "metadata": {
            "streams": [
                {
                    "index": 0,
                    "codec_name": "aac",
                    "codec_long_name": "AAC (Advanced Audio Coding)",
                    "profile": "LC",
                    "codec_type": "audio",
                    "codec_tag_string": "mp4a",
                    "codec_tag": "0x6134706d",
                    "sample_fmt": "fltp",
                    "sample_rate": "48000",
                    "channels": 2,
                    "channel_layout": "stereo",
                    "bits_per_sample": 0,
                    "r_frame_rate": "0/0",
                    "avg_frame_rate": "0/0",
                    "time_base": "1/48000",
                    "start_pts": 0,
                    "start_time": "0.000000",
                    "duration_ts": 528384,
                    "duration": "11.008000",
                    "bit_rate": "128236",
                    "nb_frames": "518",
                    "disposition": {
                        "default": 1,
                        "dub": 0,
                        "original": 0,
                        "comment": 0,
                        "lyrics": 0,
                        "karaoke": 0,
                        "forced": 0,
                        "hearing_impaired": 0,
                        "visual_impaired": 0,
                        "clean_effects": 0,
                        "attached_pic": 0,
                        "timed_thumbnails": 0
                    },
                    "tags": {
                        "language": "eng",
                        "handler_name": "SoundHandler",
                        "vendor_id": "[0][0][0][0]"
                    }
                },
                {
                    "index": 1,
                    "codec_name": "h264",
                    "codec_long_name": "H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10",
                    "profile": "Main",
                    "codec_type": "video",
                    "codec_tag_string": "avc1",
                    "codec_tag": "0x31637661",
                    "width": 1920,
                    "height": 1080,
                    "coded_width": 1920,
                    "coded_height": 1080,
                    "closed_captions": 0,
                    "has_b_frames": 2,
                    "sample_aspect_ratio": "1:1",
                    "display_aspect_ratio": "16:9",
                    "pix_fmt": "yuv420p",
                    "level": 30,
                    "color_range": "tv",
                    "color_space": "bt709",
                    "color_transfer": "bt709",
                    "color_primaries": "bt709",
                    "chroma_location": "left",
                    "refs": 1,
                    "is_avc": "true",
                    "nal_length_size": "4",
                    "r_frame_rate": "30000/1001",
                    "avg_frame_rate": "30000/1001",
                    "time_base": "1/30000",
                    "start_pts": 0,
                    "start_time": "0.000000",
                    "duration_ts": 327327,
                    "duration": "10.910900",
                    "bit_rate": "4150710",
                    "bits_per_raw_sample": "8",
                    "nb_frames": "327",
                    "disposition": {
                        "default": 1,
                        "dub": 0,
                        "original": 0,
                        "comment": 0,
                        "lyrics": 0,
                        "karaoke": 0,
                        "forced": 0,
                        "hearing_impaired": 0,
                        "visual_impaired": 0,
                        "clean_effects": 0,
                        "attached_pic": 0,
                        "timed_thumbnails": 0
                    },
                    "tags": {
                        "language": "eng",
                        "handler_name": "VideoHandler",
                        "vendor_id": "[0][0][0][0]"
                    }
                }
            ],
            "chapters": [],
            "format": {
                "filename": "https://peersuma-store.s3.amazonaws.com/project_64fdc30ae33ea60bd3f58231/input/working/923f3759-539f-468d-8cf0-99cacf93954d_20230911_193147.mp4",
                "nb_streams": 2,
                "nb_programs": 0,
                "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
                "format_long_name": "QuickTime / MOV",
                "start_time": "0.000000",
                "duration": "11.051000",
                "size": "5851097",
                "bit_rate": "4235705",
                "probe_score": 100,
                "tags": {
                    "major_brand": "isom",
                    "minor_version": "512",
                    "compatible_brands": "isomiso2avc1mp41",
                    "encoder": "Lavf57.71.100"
                }
            }
        }
    }
}

You can see that the audio stream has a duration of 11.008 seconds, the video stream has a duration of 10.91 seconds, and the container (video and audio together) has a duration of 11.05 seconds.

So everything is slightly out of sync. By setting the length to 10.91 you use the video duration, which cuts the sound off at the end. When the length is 11.01, the last sample of the audio is repeated a few extra times for some reason (a bug).

Will investigate what we can do via our backend and the duration we return when you upload a file. Perhaps if we use the video stream duration this would not happen.

This explains why the issue is occurring but it will require further investigation from our end.
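
In the meantime, here is a rough sketch of the workaround in Python: read the per-stream durations out of a probe response and use the video stream's duration as the clip length. It assumes the response has the shape shown above. The helper names (probe_url, stream_durations, safe_clip_length) are made up for illustration and are not part of any Shotstack SDK, and the sample dictionary is a trimmed copy of the probe output rather than a live API call.

```python
import math
from urllib.parse import quote

# Endpoint prefix copied from the probe URL earlier in this thread.
PROBE_ENDPOINT = "https://api.shotstack.io/v1/probe/"


def probe_url(src: str) -> str:
    """Build a probe URL by percent-encoding the whole source URL."""
    return PROBE_ENDPOINT + quote(src, safe="")


def stream_durations(probe: dict) -> dict:
    """Return durations in seconds keyed by 'container', 'audio', 'video'."""
    metadata = probe["response"]["metadata"]
    durations = {"container": float(metadata["format"]["duration"])}
    for stream in metadata["streams"]:
        durations[stream["codec_type"]] = float(stream["duration"])
    return durations


def safe_clip_length(probe: dict, decimals: int = 2) -> float:
    """Video stream duration, rounded DOWN so the clip never outlasts the video."""
    video = stream_durations(probe)["video"]
    factor = 10 ** decimals
    return math.floor(video * factor) / factor


# Trimmed copy of the probe output above (only the fields used here).
sample = {
    "response": {
        "metadata": {
            "streams": [
                {"codec_type": "audio", "duration": "11.008000"},
                {"codec_type": "video", "duration": "10.910900"},
            ],
            "format": {"duration": "11.051000"},
        }
    }
}

print(stream_durations(sample))   # container, audio and video durations
print(safe_clip_length(sample))   # 10.91
```

Rounding down rather than to the nearest value is a deliberate choice here: a length slightly shorter than the video stream loses a fraction of a frame, while a length slightly longer asks the editor for frames that do not exist.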

Lucas, thanks for the reply and investigation. This video was taken by me using a Canon XA30 camcorder, not a mobile phone. It’s possible my camcorder is set to a frame rate of 29.97 and I’m using 30fps in Shotstack. If you think this might be a possible cause, let me know. Otherwise, hopefully some other fix will present itself from your end, or I will shave a tenth of a second off videos in cases like these. Not the end of the world.

I did check the frame rates and yes, the camera footage is 30000/1001 (r_frame_rate), so 29.97, but when I rendered the video at 29.97 there was a black section at the end. Perhaps the video ends but the audio is still playing. I still think it is because the audio and video are out of sync. Will keep investigating and hopefully we can optimise and improve this.
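
For anyone who wants to check the numbers, here is the arithmetic as a small Python sketch, using only values from the probe output (nb_frames, r_frame_rate, time_base, duration_ts). The tail_gap helper is hypothetical and only does the maths. It says nothing about how the renderer actually fills the gap, which per the posts above shows up as a stutter at 30 fps and a black section at 29.97.

```python
from fractions import Fraction

# Values copied from the video stream in the probe output.
source_fps = Fraction(30000, 1001)   # r_frame_rate, about 29.97
nb_frames = 327
time_base = Fraction(1, 30000)
duration_ts = 327327

# Two ways to get the video duration; they agree exactly.
video_seconds = duration_ts * time_base          # 327327/30000 = 10.9109 s
assert video_seconds == nb_frames / source_fps

# Audio stream: duration_ts 528384 at time_base 1/48000.
audio_seconds = Fraction(528384, 48000)          # 11.008 s


def tail_gap(clip_length: str, output_fps):
    """Seconds (and output frames) by which the clip outlasts the video stream.

    A negative result means the clip ends before the video does.
    """
    gap = Fraction(clip_length) - video_seconds
    return gap, gap * output_fps


gap_s, gap_frames = tail_gap("11.01", 30)
print(float(gap_s), float(gap_frames))           # 0.0991 2.973

gap_s, gap_frames = tail_gap("10.91", 30)
print(float(gap_s) < 0)                          # True: no gap to fill

print(float(audio_seconds - video_seconds))      # audio outlasts video by ~0.097 s
```

So a length of 11.01 asks for roughly three output frames past the last real video frame, whichever output frame rate is used, while 10.91 ends just inside the video stream.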