MKBHD's editing formula, measured: 7.4 cuts a minute and one deliberate pause

There is a lot of folklore about how MKBHD videos are cut. People assume flashy captions, dense chapter markers, split screens, a firehose of jump cuts. So we stopped assuming and measured. We ran ffmpeg scene detection and vision frame classification on three recent high-performing reviews: 43 minutes of video, 323 hard cuts, 148 classified frames.

The result is almost the opposite of the folklore. The pace is conversational, not frantic. The structure is a strict two-shot alternation engine. And the single most deliberate edit in every video is not a cut at all. It is a long, unbroken pause, placed in the same spot every time.

This is an independent editing analysis. We are not affiliated with or endorsed by MKBHD or Marques Brownlee.

What we measured

Three videos: "So This is Peak Smartphone" (13:17, 4.93M views), "Samsung Galaxy S26 Ultra Review: There's a Catch" (12:35, 3.75M), and "The Truth About the 'Whoop Killer'" (17:42, 4.33M). For each one we ran ffmpeg scene detection at a 0.30 threshold to find hard cuts, sampled frames every 15 to 20 seconds plus extra samples at the hook, section pivots, and ending, and classified each frame by layout (talking head, b-roll, graphic, screen recording, split). This was part of a larger study covering three long-form creators, 963 cuts and 502 frames in total; the cross-creator laws are written up separately. This piece is the MKBHD slice.
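
For readers who want to try a similar pass on their own footage, here is a minimal sketch of the cut-detection step. It is not the pipeline used for this study. It assumes ffmpeg is on the PATH and uses ffmpeg's `select` filter with the `scene` score plus `showinfo` to log the surviving frames. The function names (`scene_cut_command`, `parse_cut_times`, `detect_cuts`) are ours, made up for illustration.

```python
import re
import subprocess

# showinfo logs one line per selected frame; each carries a pts_time field in seconds.
PTS_TIME = re.compile(r"pts_time:\s*([0-9]+(?:\.[0-9]+)?)")


def scene_cut_command(path: str, threshold: float = 0.30) -> list[str]:
    """Build an ffmpeg command that keeps only frames whose scene score exceeds threshold."""
    return [
        "ffmpeg", "-hide_banner", "-i", path,
        "-vf", f"select='gt(scene,{threshold})',showinfo",
        "-an", "-f", "null", "-",
    ]


def parse_cut_times(showinfo_log: str) -> list[float]:
    """Extract cut timestamps (seconds) from showinfo lines in ffmpeg's stderr output."""
    return [
        float(match.group(1))
        for line in showinfo_log.splitlines()
        if "Parsed_showinfo" in line
        for match in PTS_TIME.finditer(line)
    ]


def detect_cuts(path: str, threshold: float = 0.30) -> list[float]:
    """Run ffmpeg on a file and return the detected hard-cut timestamps."""
    result = subprocess.run(
        scene_cut_command(path, threshold),
        capture_output=True, text=True, errors="replace", check=True,
    )
    return parse_cut_times(result.stderr)
```

A lower threshold catches softer transitions and more false positives; 0.30 is the value quoted above, and it is worth spot-checking a few detected cuts by eye before trusting the counts.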

The headline numbers

7.4 cuts per minute on average (range 6.4 to 8.9 across the three videos). That is a cut roughly every 8 seconds. For comparison, Fireship runs about 13 per minute. MKBHD's pace is conversational.
Median shot length 3.6 to 6.7 seconds per video, about 5.5 to 6.5 seconds in aggregate, with the 90th percentile around 16 seconds.
Of the classified frames, 48% are talking head and 43% are b-roll. Full-frame graphics and title cards are 0 to 9% per video, screen recordings under 1%, and split layouts exactly 0%.
One 32 to 55 second static hold per video, always a reasoning or verdict monologue, always placed at 72 to 92% of the runtime.
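
The pacing figures above are simple to derive once you have a list of cut timestamps. The sketch below shows one way to do it, assuming the cuts are given in seconds and the video duration is known; `pacing_stats` is a hypothetical helper, not code from the study. The aggregate line at the bottom only reuses numbers already quoted in this piece (323 cuts, three runtimes).

```python
from statistics import median, quantiles


def pacing_stats(cut_times, duration_s):
    """Summarize pacing from hard-cut timestamps (seconds) and total duration."""
    # Shots are the gaps between consecutive cuts, including the first and last shot.
    bounds = [0.0, *sorted(cut_times), duration_s]
    shots = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return {
        "cuts_per_minute": len(cut_times) * 60 / duration_s,
        "median_shot_s": median(shots),
        # quantiles(n=10) returns 9 cut points; the last one is the 90th percentile.
        "p90_shot_s": quantiles(shots, n=10)[-1],
    }


# Aggregate check: 323 cuts over 13:17 + 12:35 + 17:42 of video.
total_s = 797 + 755 + 1062
aggregate_cpm = 323 * 60 / total_s  # about 7.4 cuts per minute
```

Note that "a cut every 8 seconds" is an average (60 / 7.4), while the 5.5 to 6.5 second figure is a median. A handful of long shots pulls the average above the median, which is the pattern you would expect from a video with one long hold.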

The cut rate is not flat across a video. It front-loads, settles, and then deliberately collapses at the end. Here is the shape, measured as cuts per 30-second window:

The hook runs 10 to 14 cuts in the first 30 seconds, 1.1 to 1.9 times the body rate. The body settles into the 7.4 per minute cruise. The last 30 seconds drop to 2 to 5 cuts. The editing literally slows down to let the verdict land.
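
To draw the same kind of curve for your own video, bucket the cut timestamps into fixed windows. This is a minimal sketch under the assumption that cuts are in seconds; `cuts_per_window` is an illustrative name of ours.

```python
import math


def cuts_per_window(cut_times, duration_s, window_s=30.0):
    """Count hard cuts in consecutive fixed-length windows across the video."""
    counts = [0] * math.ceil(duration_s / window_s)
    for t in cut_times:
        if 0 <= t < duration_s:
            counts[int(t // window_s)] += 1
    return counts
```

The first entry is the hook and the last entry is the ending. The final window is usually shorter than 30 seconds unless the runtime divides evenly, so compare it with that in mind.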

The engine: talking head alternating with b-roll, every 20 to 40 seconds

Strip away everything else and an MKBHD review is two shot types traded back and forth: full-frame talking head (48% of frames) and product b-roll (43%). The layout flips on a roughly 20 to 40 second period. That is the whole machine. There are no picture-in-picture layouts, no split screens (0 of 148 frames), and almost no screen recordings (under 1%).

This matters because it separates two rhythms most editors conflate. The micro rhythm is the cut every 8 seconds or so. The macro rhythm is the layout change every 20 to 40 seconds. You can jump-cut a talking head forever and still feel static, because the frame never changes character. MKBHD's edit handles both: within a talking-head stretch the cuts change camera angle and framing on pace, and before 40 seconds elapse the video swaps to b-roll of the product, then swaps back.

The practical takeaway: when you review your own timeline, do not just count cuts. Scan for any 40-second stretch where the layout class never changes. That is where attention leaks.
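
That scan can be automated once frames are labeled. The sketch below assumes you already have time-stamped layout labels (for example "talking_head" or "b_roll") from whatever classifier or manual pass you use; the function name and label strings are ours. Because frames are only sampled every so often, the measured span is a lower bound on the real stretch.

```python
from itertools import groupby


def stale_layout_stretches(samples, max_hold_s=40.0):
    """Flag runs of consecutive samples that share one layout label for too long.

    samples: iterable of (timestamp_s, layout_label) pairs.
    Returns (first_sample_s, last_sample_s, label) for each flagged run.
    """
    flagged = []
    # Sort by time, then group consecutive samples with the same label.
    for label, run in groupby(sorted(samples), key=lambda s: s[1]):
        times = [t for t, _ in run]
        if times[-1] - times[0] >= max_hold_s:
            flagged.append((times[0], times[-1], label))
    return flagged
```

A deliberate verdict hold will show up in this list too. That is expected; the point is that it should be the only entry.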

The hook: thesis at 2.4 seconds, logo after

Two of the three videos speak the thesis about 2.4 seconds in. "So, this is peak slab phone" is the literal opening line of one of them. No logo first, no montage first. The roughly 3-second branded intro animation exists, but it only ever appears after a spoken tease. The third video opens with a 42-second wordless skit, which is the exception that proves the confidence: he can afford one because the channel has 15 years of trust, and even then it is a deliberate bit, not a slow ramp.

The hook is also where the cut rate peaks: 10 to 14 cuts in the first 30 seconds. Fast opening, spoken promise, brand sting second. If you are borrowing one thing from this section, borrow the order.

The text system: zero burned captions, four precise rituals

Not one of the 148 classified frames has burned-in speech captions. Zero. This matches the wider study, where 0 of 502 frames across three top creators had them; we wrote that finding up in our piece on why top YouTube videos don't burn captions. On-screen text in an MKBHD review is instead four specific, repeatable devices:

1. The spec card. One white card with at most 4 bullets, laid over slow hero b-roll, about 20 seconds into the video. It appears once and never again.
2. Source-credit tags. Every borrowed clip carries a small black diagonal credit tag. Every single one.
3. The scorecard pivot. A checkbox scorecard graphic marks the "here's the catch" turn of the review. This is the structural hinge, and it is a graphic, not a chapter card.
4. Data charts and product labels. Full-frame custom bar charts for numeric comparisons, and plain product labels under side-by-side comparison shots.

The one deliberate pause

In each of the three videos, the longest unbroken shot runs 32 to 55 seconds, and it is always the same thing: a reasoning or verdict monologue, sitting at 72 to 92% of the runtime. After ten-plus minutes of alternation, the cutting stops, the camera holds, and the argument gets made in one take.

This is the most copyable insight in the whole dataset. Static screen time is a budget you spend: an accidental 45-second static stretch mid-video is the classic amateur tell, but the same 45 seconds placed at the verdict is a feature. The stillness signals "this is the part that matters." It only works because everything before it moved.

Budget exactly one long hold per video. Place it at the payoff, past the 72% mark. Spend it nowhere else.
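
To check where your own longest hold sits, find the longest gap between cuts and express its start as a fraction of the runtime. This is a sketch with assumptions: `longest_hold` and `in_payoff_zone` are hypothetical helpers, and we measure position from the start of the shot, which is one reasonable reading of "placed at 72 to 92% of the runtime".

```python
def longest_hold(cut_times, duration_s):
    """Return the longest unbroken shot and where it starts, as a share of runtime."""
    bounds = [0.0, *sorted(cut_times), duration_s]
    start, end = max(zip(bounds, bounds[1:]), key=lambda pair: pair[1] - pair[0])
    return {
        "start_s": start,
        "length_s": end - start,
        "position": start / duration_s,  # assumption: position is measured at shot start
    }


def in_payoff_zone(hold, low=0.72, high=0.92):
    """True when the hold starts inside the 72 to 92% band described above."""
    return low <= hold["position"] <= high
```

If the longest shot lands early in the video, it is worth asking whether it is a hold you chose or one you missed.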

The ending liturgy

The endings are formulaic in the good sense: verdict recap, then a question aimed at the comments, then the fixed sign-off ending in "Peace." The cut rate drops to 2 to 5 cuts in the final 30 seconds, and speech ends just 4 to 6 seconds before the video does. That is more than 99% of the duration carrying actual content.
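
The 99% figure is simple arithmetic, and it holds even in the worst case the numbers allow: the shortest video in the sample (12:35) paired with the longest trailing gap (6 seconds). Which gap belongs to which video is not stated here, so the pairing below is an assumption chosen to be pessimistic; the helper names are ours.

```python
def to_seconds(mmss):
    """Convert an "MM:SS" runtime string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def content_share(duration_s, trailing_gap_s):
    """Share of the runtime that elapses before speech ends."""
    return 1 - trailing_gap_s / duration_s


# Pessimistic pairing: shortest runtime with the largest gap.
worst_case = content_share(to_seconds("12:35"), 6)  # about 0.992
```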

What is missing is the point: no endscreen slate, no subscribe animation, no dead outro segment. The video simply finishes talking and stops.

The formula is also what he leaves out

Commonly assumed	Measured reality
Burned-in speech captions	0 of 148 frames
YouTube chapter markers	0 of 3 videos; segmentation is done in-edit with title cards, the scorecard pivot, or a hard cut back to studio framing
Split-screen comparisons	0% of frames; comparisons use sequential shots with product labels
Frantic MrBeast-style pacing	7.4 cuts/min, a conversational cruise
Endscreen outro	None; speech runs to within 4 to 6 seconds of the last frame

The restraint is the style. Every device in the videos is one of a small, fixed set, executed identically every time.

How to reproduce it, step by step

1. Speak the thesis in the first sentence. Target under 10 seconds; the measured openings land around 2.4 seconds. Brand sting only after the tease, and keep it around 3 seconds.
2. Build the alternation engine first. Lay out talking head and product b-roll so the layout flips every 20 to 40 seconds, aiming near a 50/50 frame split. Then cut within each stretch toward a 5.5 to 6.5 second median shot.
3. Run the hook hotter. 10 to 14 cuts in the first 30 seconds, then relax to the body pace.
4. Do the spec-card ritual. One card, 4 bullets max, over slow hero b-roll about 20 seconds in. Never repeat it.
5. Mark the pivot with a scorecard, not a chapter. Credit-tag every borrowed clip. Use full-frame charts for any numbers.
6. Spend your one pause on the verdict. One 32 to 55 second unbroken take, placed at 72 to 92% of the runtime.
7. End clean. Recap, question to comments, fixed sign-off, hard stop. Speech should end within about 6 seconds of the final frame. Delete the endscreen.
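
The steps above can be turned into a rough checklist you run against your own edit. This is a sketch, not a standard: the `EditReport` fields are measurements you would have to supply yourself, the names are ours, and the thresholds are just the ranges quoted in this piece.

```python
from dataclasses import dataclass


@dataclass
class EditReport:
    """Hypothetical container for measurements taken from one finished edit."""
    thesis_at_s: float       # when the thesis is first spoken
    hook_cuts_30s: int       # hard cuts in the first 30 seconds
    median_shot_s: float     # median shot length
    spec_cards: int          # number of spec cards shown
    long_holds: int          # unbroken takes of 30 seconds or more
    hold_position: float     # start of the long hold as a share of runtime
    speech_to_end_s: float   # gap between last spoken word and final frame


def failed_checks(report):
    """Return the names of the checks this edit does not meet."""
    checks = {
        "thesis spoken within 10 s": report.thesis_at_s < 10,
        "hook runs 10 to 14 cuts": 10 <= report.hook_cuts_30s <= 14,
        "median shot 5.5 to 6.5 s": 5.5 <= report.median_shot_s <= 6.5,
        "exactly one spec card": report.spec_cards == 1,
        "exactly one long hold": report.long_holds == 1,
        "hold at 72 to 92% of runtime": 0.72 <= report.hold_position <= 0.92,
        "speech ends within 6 s of final frame": report.speech_to_end_s <= 6,
    }
    return [name for name, passed in checks.items() if not passed]
```

Treat a failed check as a prompt to look at that part of the timeline, not as a verdict. The ranges come from three videos by one creator.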

Where WritePanda fits (and where it doesn't)

To be honest about it: no tool gives you MKBHD's camera work, product b-roll, or fifteen years of taste. What an editor can automate is the rhythm layer. WritePanda's agent can cut a talking-head track to a target pace, flag any 40-second stretch where the layout never changes, place zooms on transcript beats instead of a grid, and build the spec cards, scorecards, and bar charts as motion graphics, while leaving your verdict take untouched. This study is literally what our long-form editing style guide is calibrated against. If that workflow sounds interesting, the full picture is in our piece on how to edit videos with AI agents.

FAQ

How many cuts per minute does MKBHD use?

Across the three videos we measured, 6.4 to 8.9 hard cuts per minute, averaging 7.4. The median shot runs roughly 5.5 to 6.5 seconds, far slower than the fast-cut style people associate with big YouTube channels.

Does MKBHD use burned-in captions?

No. 0 of the 148 frames we classified contained subtitle-style captions. On-screen text is limited to one spec card per video, source-credit tags on borrowed clips, a scorecard at the review's pivot, product labels, and full-frame data charts.

Does MKBHD use YouTube chapters?

Not in this sample: 0 of 3 videos had chapter markers. Sections are signaled inside the edit itself, with a section title card, the scorecard pivot, or a hard cut back to the studio framing.

What is the "one deliberate pause"?

Every video in the sample contains exactly one long unbroken shot of 32 to 55 seconds, always a verdict or reasoning monologue, always placed at 72 to 92% of the runtime. It is the longest static stretch in each video and it is clearly intentional: the editing slows down precisely when the conclusion arrives.