skillby chrislema
sync-feeds
Sync multiple camera feeds by audio cross-correlation for multi-angle video editing
Installs: 0
Used in: 1 repos
Updated: 0mo ago
$
npx ai-builder add skill chrislema/sync-feedsInstalls to .claude/skills/sync-feeds/
## When to use
Use this skill when the user has multiple camera angles of the same recording and needs to sync them before editing. Triggered by `/sync-feeds` or requests like "sync these videos," "align the cameras," or "sync the angles."
## How to use
### Prerequisites
- `ffmpeg` must be installed
- `scipy` and `numpy` must be installed (`pip3 install scipy numpy`)
### Parameters
- **primary**: Path to the primary video file (the one with the quality microphone). This is always the first argument.
- **secondaries**: Paths to 1-2 secondary video files (other camera angles). These follow the primary.
If the user doesn't specify which is primary, ask. The primary is the one with the good audio — it provides the audio track for the final output.
Example invocations:
- `/sync-feeds main_camera.mp4 side_angle.mp4`
- `/sync-feeds primary.mp4 angle2.mp4 angle3.mp4`
### Process
#### Step 1: Extract audio from all feeds
Extract mono 16kHz WAV from each file for cross-correlation. Use PID-unique temp filenames to avoid collisions.
```bash
ffmpeg -y -i <primary> -ac 1 -ar 16000 -f wav /tmp/sync_primary_$PID.wav
ffmpeg -y -i <secondary1> -ac 1 -ar 16000 -f wav /tmp/sync_sec1_$PID.wav
# repeat for secondary2 if present
```
#### Step 2: Get durations of all feeds
```python
import subprocess
def get_duration(path):
result = subprocess.run(
["ffprobe", "-v", "error", "-show_entries", "format=duration",
"-of", "default=noprint_wrappers=1:nokey=1", path],
capture_output=True, text=True
)
return float(result.stdout.strip())
```
#### Step 3: Cross-correlate to find offset
For each secondary, find its time offset relative to the primary's timeline.
```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import fftconvolve
def find_offset(primary_wav_path, secondary_wav_path):
"""Find the offset of secondary relative to primary's timeline.
Returns (offset_seconds, confidence).
offset_seconds: the time in primary's timeline where secondary's t=0 aligns.
Positive = secondary started recording after primary.
Negative = secondary started recording before primary.
confidence: normalized correlation peak (0-1). Above 0.3 is typically a good match.
"""
sr_p, primary_audio = wavfile.read(primary_wav_path)
sr_s, secondary_audio = wavfile.read(secondary_wav_path)
assert sr_p == sr_s, "Sample rates must match"
sample_rate = sr_p
# Use a 30-second chunk from the middle of the secondary
chunk_duration = 30 # seconds
chunk_samples = chunk_duration * sample_rate
mid = len(secondary_audio) // 2
half_chunk = min(chunk_samples // 2, mid)
chunk = secondary_audio[mid - half_chunk : mid + half_chunk].astype(np.float64)
primary_f = primary_audio.astype(np.float64)
# Normalize to [-1, 1]
chunk_max = np.max(np.abs(chunk))
primary_max = np.max(np.abs(primary_f))
if chunk_max > 0:
chunk /= chunk_max
if primary_max > 0:
primary_f /= primary_max
# Cross-correlate using FFT (fast for large arrays)
correlation = fftconvolve(primary_f, chunk[::-1], mode='full')
# Find peak
peak_idx = np.argmax(np.abs(correlation))
peak_value = np.abs(correlation[peak_idx])
# Confidence: normalized correlation
confidence = peak_value / np.sqrt(np.sum(chunk**2) * np.sum(primary_f**2))
# Convert peak position to time offset
# In 'full' mode, at peak_idx the chunk's start aligns at position:
# (peak_idx - len(chunk) + 1) in primary's sample space
chunk_start_in_primary_samples = peak_idx - len(chunk) + 1
chunk_start_in_primary_seconds = chunk_start_in_primary_samples / sample_rate
# The chunk was taken from secondary starting at sample (mid - half_chunk)
chunk_origin_in_secondary = (mid - half_chunk) / sample_rate
# Secondary's t=0 in primary's timeline
offset = chunk_start_in_primary_seconds - chunk_origin_in_secondary
return offset, confidence
```
**Confidence threshold**: If confidence is below 0.15, warn the user that the sync may be unreliable. Below 0.05, stop and ask the user to verify the files contain overlapping audio.
#### Step 4: Calculate common overlap and trim primary only
Calculate the time range where all cameras were recording simultaneously. Trim only the **primary** to this range (stream copy). Secondaries are **not trimmed** — they stay as original files.
```python
def calculate_overlap(primary_duration, secondaries_info):
"""Calculate the common overlap across all feeds.
secondaries_info: list of (offset, duration) tuples for each secondary.
Returns (primary_start, primary_end, overlap_duration).
"""
common_start = 0.0
common_end = primary_duration
for offset, duration in secondaries_info:
common_start = max(common_start, offset)
common_end = min(common_end, offset + duration)
if common_end <= common_start:
raise ValueError("No overlapping time range found — feeds may not be from the same recording")
return common_start, common_end, common_end - common_start
```
Trim the primary to the overlap range:
```bash
# Trim primary only — keep video and audio
ffmpeg -y -ss <common_start> -to <common_end> -i <primary> \
-c copy -avoid_negative_ts make_zero \
<primary_basename>_synced.mp4
```
**Do NOT trim the secondaries.** They stay as original files. The produce-zoom step will extract from them at the correct timestamps during re-encoding, which gives frame-accurate alignment without keyframe snapping issues.
#### Step 5: Write sync manifest
Write a JSON manifest that records everything downstream steps need to find secondary footage for any moment in the synced primary's timeline.
```python
import json
manifest = {
"primary_original": str(primary_path),
"primary_synced": str(synced_primary_path),
"overlap": {
"primary_start": common_start,
"primary_end": common_end,
"duration": overlap_duration
},
"secondaries": []
}
for i, (sec_path, offset, confidence) in enumerate(secondary_results):
manifest["secondaries"].append({
"file": str(sec_path),
"sync_offset": offset,
"confidence": confidence,
"index": i + 1 # secondary_1, secondary_2
})
base_name = os.path.splitext(primary_path)[0]
manifest_path = f"{base_name}_sync_manifest.json"
with open(manifest_path, "w") as f:
json.dump(manifest, f, indent=2)
```
The manifest enables timestamp mapping. For any time `t` in the synced primary's timeline, the corresponding time in a secondary's original file is:
```python
secondary_time = t + manifest["overlap"]["primary_start"] - secondary["sync_offset"]
```
This formula works because:
- `t` is relative to the synced primary (which starts at `overlap.primary_start` in the original primary's timeline)
- `t + primary_start` converts to original primary time
- Subtracting `sync_offset` converts from primary time to secondary time
#### Step 6: Clean up temp files
Delete the extracted WAV files:
```bash
rm -f /tmp/sync_primary_$PID.wav /tmp/sync_sec*_$PID.wav
```
### Output
- `<primary_basename>_synced.mp4` — primary video trimmed to common overlap, audio preserved
- `<primary_basename>_sync_manifest.json` — manifest with offsets and secondary file paths
- Secondary files are **not modified** — they stay as-is in their original location
### Report
Print:
- Offset found for each secondary (in seconds and in frames at the video's fps)
- Correlation confidence for each secondary
- Original duration of each feed
- Common overlap duration
- Time trimmed from the start and end of the primary
- Timestamp mapping formula for verification
- Manifest file path
### Important notes
- **Audio required for sync**: All cameras must have captured audible room audio for cross-correlation to work. If a secondary has no audio track, this method won't work — the user would need to provide the offset manually (e.g., from a clap/slate).
- **Same room, same event**: This assumes all cameras were recording the same audio event in the same room. It will not work for unrelated recordings.
- **30-second correlation window**: Uses a 30-second chunk for speed. For recordings under 60 seconds, the full secondary audio is used instead.
- **Only the primary is trimmed.** The primary is trimmed with stream copy to the overlap range. Keyframe snapping on this trim is fine — the synced primary becomes the reference timeline. All downstream timestamps are relative to it.
- **Secondaries stay untouched.** No trimming, no re-encoding, no stream copy. The produce-zoom step extracts from the original secondary files at frame-accurate timestamps during its re-encoding pass. This avoids keyframe alignment issues entirely.
- **The manifest is the sync data.** Downstream steps (produce-zoom) read the manifest to find secondary files and compute timestamps. The manifest travels with the synced primary through the pipeline.Quick Install
$
npx ai-builder add skill chrislema/sync-feedsDetails
- Type
- skill
- Author
- chrislema
- Slug
- chrislema/sync-feeds
- Created
- 0mo ago