Search a TB-scale video library — locally, in seconds.

Windows 100% local video processing Free trial · 500 videos, never expires

Once your archive is indexed, ClipCatalog combines tag search, spoken-word search, natural-language description search, face filters, and metadata filters into a single query path across the whole library. Drop thousands of clips down to dozens in a few clicks.

No per-minute pricing. No subscriptions. A single $99 license to make every video you own searchable across every drive.

Download free trial See pricing →

Walkthrough of the ClipCatalog search interface — switching between visual, transcript, person, and metadata filters across an indexed local video library.

The "I know it's in there somewhere" problem

Past a few terabytes, even good folder structure stops being enough. You know the shot exists, but not on which drive, in which trip folder, in which interview, in which take. Without library-wide search, you scroll. With ClipCatalog, you query — and the right clip surfaces in seconds.

Without library-wide search

You know the shot exists, but not on which drive or in which year's folder
Cloud SaaS pricing scales with library size — exactly when you need search the most
Single-purpose tools index either tags, or transcripts, or faces — not all of them together

With ClipCatalog

One search bar covers every folder, every drive, and every year of footage
Combine a tag, a spoken word, a face, and a date range in one query
Saved presets turn recurring multi-filter queries into one-click recalls

How library-wide video search works

Three things have to be true at TB scale: every clip must be reachable from one search, every retrieval mode must run against the same index, and the index must survive drive reorganizations. ClipCatalog handles all three locally.

Video search →

Point at every folder

Add internal drives, external SSDs, NAS shares, and archive disks. ClipCatalog tracks each one by storage volume so the catalog survives drive moves, renames, and detach/reattach cycles. Subfolders can be excluded when only part of a directory should be indexed.

Local AI builds the index

Thumbnail extraction, AI visual tagging, Whisper transcription, face detection, and natural-language embeddings all run on your hardware — DirectML or Vulkan when available, CPU fallback otherwise. The first pass is a one-time cost; after that, every search hits a local index.

Search by stacking filters

Combine tag, transcript, face, footage-type, and metadata filters in a single query. Each filter trims the result set further — a 10,000-clip library narrows to a handful in seconds. Save any combination you'll run again as a preset; the next search is one click away.

Library-wide searches that finally become easy

Six searches that demonstrate what library-wide retrieval looks like in practice — from a single visual tag to a saved preset that combines four filters at once. Each example assumes the index is already built; from there, the query is the easy part.

beach + sunset — two visual tags, Match-All — across every drive you own (tag search)

conclusion in your tutorial archive — every wrap-up moment across years of recordings (single-word transcript search)

drone shot over a foggy valley at sunrise — describe what you want in plain language (natural-language search, Balanced strictness)

a specific face + interview tag + 2023 date range — every interview clip with one named person, in one season (face filter + tag search + metadata)

scenic footage-type + 4K resolution + drone — every wide-angle scenic 4K shot, no dialogue, ready for the next opening sequence (footage-type + technical filters)

Saved preset: "client A — interviews 2024" — face filter + dialogue-heavy footage-type + tag interview + 2024 date range, one click (preset combining four modes)

Who searches a TB-scale video library?

Six shapes of real archives where library-wide search pays off the moment you stop scrolling and start querying.

Broadcast news archive with decade-spanning indexed content

Roughly 40 TB, 80 000 clips, four newsroom edit suites pulling material every day. When a story breaks, the question is "what do we already have on this person, this place, this incident?" A face filter + a single-word transcript search + a date-range filter answers it across ten years of coverage in seconds.

Stock-footage library with rights-cleared catalog spanning decades

Hundreds of thousands of clips, every one tagged for what's on screen. Buyers come with very specific briefs ("slow-motion aerial of a coastline at golden hour"). A natural-language description search at Strict strictness plus a scenic footage-type filter plus a drone filter narrows hundreds of thousands of clips to a license-able shortlist in one query.

Legal e-discovery across deposition and body-cam footage

Twenty-five terabytes of recordings across discovery: depositions, body-cam pulls, neighbourhood-camera dumps. Opposing counsel asks a question; the team needs every clip mentioning a specific term — fast, and without uploading sealed material to a third party. Single-word transcript search across the lot, then narrow by date range and speaker face.

Sports analytics across multi-season game and training libraries

Five seasons of full-game multi-camera captures plus daily training footage adds up to tens of terabytes. The analyst needs "every counter-attack started by player X in matches we lost" — a face filter + a tag-search for the action + a metadata filter for losses narrows years of tape to a focused study reel in one query.

University media lab and scientific imaging archive spanning a decade

Eighty terabytes of experiment recordings: behavioural trials, microscope captures, fieldwork, time-lapse rigs. The next paper needs comparable footage from older studies — a natural-language description search at Balanced strictness plus a date-range filter surfaces every comparable run across the decade, without a graduate student spending two weeks digging through external drives.

Post-production archival vault for completed client work

Eight years of finished projects archived for the contractually required retention window — well past 100 TB. The producer comes back asking for "the lifestyle reel from the spring 2024 cosmetics campaign": a saved preset combining client tag + project date range + dialogue-free footage-type returns the right reel in seconds, even though most of the archive lives on cold drives.

What to expect at TB scale

Large libraries put pressure on every step of an indexing pipeline. ClipCatalog is built around the trade-offs that actually matter at TB scale — here's the honest version.

The first index is the biggest cost

Plan the first pass overnight or across a weekend on a multi-TB library. After it completes, incremental syncs only process the new content. Pause and resume as needed — progress is preserved.

Hardware controls put speed in your hands

Pick the GPU used for content analysis, the GPU used for Whisper transcription, and the device used for face detection independently. Run the built-in benchmark and ClipCatalog suggests the fastest setup for your machine.

Latency stays workable past 100 000 clips

The index is built on a local vector database and an encrypted SQLite catalog tuned for fast lookups. Expect a single-tag or single-word query to return its first results in seconds on a 20 000-clip library, and to stay responsive as the library doubles.

Combined filters keep result lists workable

A 10,000-clip library doesn't have to give you 10,000 results. Stack tag, transcript, face, footage-type, and metadata filters together — each one trims the set further, so the handful you actually want surfaces in seconds.

Four retrieval modes, one library

Tag search uses an enumerable AI vocabulary; spoken-word search lets you find single words in transcripts with Match-All/Match-Any; natural-language search ranks clips by free-text descriptions with Relaxed/Balanced/Strict strictness; face search filters by people clustered from detected faces. Use whichever mode fits the question.

Windows only for now

ClipCatalog runs on Windows 10 and 11. The app picks the fastest hardware path it can — DirectML, Vulkan, or CPU — but Mac and Linux builds are not on the near-term roadmap.

What "TB scale" actually looks like

Concrete order-of-magnitude figures on representative hardware (a recent Vulkan-class GPU, NVMe-backed catalog drive). Your numbers will shift with codec mix, model size, and storage speed — these are the shape to plan around, not a hard SLA.

≈20 000 clips

Representative test library

≈24 TB of source video, ≈3 000 hours, spread across three internal drives and one NAS share. The figures below are an order-of-magnitude planning model for a library of that shape — your real numbers will shift with codec mix, hardware, and the AI stages you enable.

Seconds, not minutes

First result latency

A single-tag query against the full 20 000-clip index returns first results in 2-5 seconds on the representative rig. Stacked filters add a fraction of a second per layer; the slowest part is usually the natural-language vector lookup.

Settings → Storage Usage

Catalog footprint on disk

Thumbnails, AI tags, transcripts, face data, and semantic vectors all add storage on top of your source video. The Settings → Storage Usage screen breaks the live total down by category in MB — measure your actual footprint after the first index pass rather than working from an estimate.

Hours to days

First-pass indexing time

Plan multi-TB libraries around overnight or weekend background work for the first pass — indexing time depends heavily on codec mix, GPU class, and how many AI stages you enable. The pipeline is resume-friendly, so it survives reboots and drive swaps.

Minutes per new clip

Incremental sync cost

After the first pass, only new or changed clips run through the AI pipeline. The existing index is never rebuilt. A daily sync on a working library typically reaches "caught up" within minutes, even when the underlying archive sits at TB scale.

The economics of searching a 10 TB archive

Cloud transcription services typically bill in the range of $0.005-$0.024 per minute of audio. Across a multi-TB archive that's hundreds of hours of source, transcribing even once runs into the high three- to four-figure range — and that's before storage tier fees, per-user search seats, or egress (cloud egress runs $0.05-$0.09 per GB, which is another $500-$900 if you ever want a 10 TB copy back). The bill grows with the library; the math gets worse the longer you stay.

ClipCatalog ingests on the drives you already own, indexes locally with the GPU you already have, and stores everything in an encrypted SQLite database on your machine. $99 one-time. No bandwidth, no egress, no recurring subscription. A 50 TB library and a 5 TB library cost the same to license; the dominant cost is the storage hardware you already own.

Comparing local-first tools for large archives? See the privacy-first video management roundup for how ClipCatalog stacks up on TB-scale ingestion, library-wide search, and offline workflows.

Searching a large video library — FAQ

How large a library can ClipCatalog handle?

It is designed for multi-terabyte archives spread across multiple folders, drives, and volumes. The catalog database, vector index, and queue are built to scale; the practical ceiling on a modern PC is mostly storage space for thumbnails and embeddings, not a hard video count.

How long does the first indexing pass take on a TB-scale library?

It depends heavily on hardware, codec mix, and how many AI stages you enable. A multi-TB library typically benefits from running the first pass overnight or across a weekend on a GPU-equipped PC. Indexing is fully resumable, so you can pause and resume.

Will queries stay fast as the library grows?

Yes — that's the part the architecture is designed around. Tag and transcript lookups use indexed columns in the local SQLite catalog; semantic and face queries use a local vector index. Both are designed to stay fast as the library grows, so a 50 000-clip library is only marginally slower than a 5 000-clip one for the same query shape.

Can I save filter combinations so I don't have to rebuild them?

Yes — saved search presets are a first-class feature. Bundle a tag-search, a transcript-word filter, a face filter, and a date range into one named preset, then run it again with a single click. Presets stay sensible even when the library grows over time.

Can I really combine tag, transcript, face, and metadata filters in one search?

Yes. Filters from the Visual, Audio, Metadata, Location, and Technical categories all apply to the same result list. Each one shaves off matches that don't meet that condition, so layered filters cut a five-figure result list down to a workable handful very quickly.

Features that pay off most as the library scales

Three capabilities matter disproportionately at scale: saved presets that turn recurring queries into one click, footage-type classification that separates dialogue from voiceover from scenic before you even apply other filters, and a unified search bar that stacks tag, transcript, face, and metadata into a single query.

External drives GPU acceleration Local-first privacy

Relevant comparisons

If you are evaluating this workflow against other tools, start with these side-by-side pages.

ClipCatalog vs Axle AI

Open comparison →

ClipCatalog vs Peakto

Open comparison →

ClipCatalog vs Daminion

Open comparison →

Try ClipCatalog free — up to 500 videos

No account required. Your footage stays on your computer.

Download free trial Buy now — $99

500 videos free No credit card · no account 100% local — footage never leaves your PC

Search a TB-scale video library — locally, in seconds.

The "I know it's in there somewhere" problem

Without library-wide search

With ClipCatalog

How library-wide video search works

Point at every folder

Local AI builds the index

Search by stacking filters

Library-wide searches that finally become easy

Who searches a TB-scale video library?

Broadcast news archive with decade-spanning indexed content

Stock-footage library with rights-cleared catalog spanning decades

Legal e-discovery across deposition and body-cam footage

Sports analytics across multi-season game and training libraries

University media lab and scientific imaging archive spanning a decade

Post-production archival vault for completed client work

What to expect at TB scale

The first index is the biggest cost

Hardware controls put speed in your hands

Latency stays workable past 100 000 clips

Combined filters keep result lists workable

Four retrieval modes, one library

Windows only for now

What "TB scale" actually looks like

Representative test library

First result latency

Catalog footprint on disk

First-pass indexing time

Incremental sync cost

The economics of searching a 10 TB archive

Searching a large video library — FAQ

How large a library can ClipCatalog handle?

How long does the first indexing pass take on a TB-scale library?

Will queries stay fast as the library grows?

Can I save filter combinations so I don't have to rebuild them?

Can I really combine tag, transcript, face, and metadata filters in one search?

Features that pay off most as the library scales

Video search

Detected content

Transcript search

Face recognition

Relevant comparisons

Related problem-centered guides

Find B-roll by what's on screen

Organize footage across drives and NAS

Search videos by spoken words

Find a person in your video library

Try ClipCatalog free — up to 500 videos