
Tape Digitization

Like many families whose children grew up in the 90s and early 2000s, my family has a wealth of video tapes on which my siblings’ and my various antics were recorded.

It’s easy to forget just how incredible a chore it is to watch one of those tapes. Rewinding, fast-forwarding, no random access…simply a pain. Thus, as in many families, those tapes have been sitting in a closet almost since the moment they were recorded.

For the larger Hi8 or Digital8 tapes, we had a tape deck…that was nonfunctional, while for the smaller DV tapes, we had no playback system at all besides the camcorders they were captured on.

Thus, I made it an objective of mine to digitize all of our tapes. I don’t recall how long ago it was, but I definitely had a “digitize tapes” item on my todo-list for the past year and a half, or more. Early this year, I finally began.

Mini-DV

First, I decided to tackle the newer mini-DV tapes. Luckily, we still had two mini-DV camcorders in good shape. All of the tapes we had were still playable on these two devices.

Mini-DV tapes store a standard-definition digital bitstream (480i, interlaced, on NTSC camcorders) on the tape. In order to capture the digital stream and not an analog approximation of it, I had to use Firewire. A couple of old laptops in the house had Firewire ports, but they didn’t seem to recognize the camera on either Windows or Linux. I ended up purchasing a PCI Firewire card for my desktop for $25.

On the software side, I captured the incoming stream using dvgrab, a wrapper around Linux’s Firewire library aimed at capturing DV signals. After capture, the raw file is in .dv form. I then converted it to H.264 with Handbrake. The capture process was fairly hands-off, though occasionally dvgrab would segfault when there was a hiccup in the video stream for any reason (dirty tape, bumping the device). That’s what you get using a C++ utility written in 2000, I suppose.
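As a rough sketch of that capture-then-transcode flow, the Python snippet below only builds the two command lines; it does not run them. The specific flags (--autosplit, --timestamp and --format raw for dvgrab; -i, -o, -e x264 and -q for HandBrakeCLI), the quality value of 20, and the file names are assumptions chosen for illustration, not necessarily the settings used for this project.

```python
import shlex


def dvgrab_command(prefix: str) -> list[str]:
    """Build a dvgrab invocation that writes raw .dv files named after `prefix`."""
    return [
        "dvgrab",
        "--autosplit",      # assumed: start a new file at each recording break
        "--timestamp",      # assumed: put the recording date/time in file names
        "--format", "raw",  # assumed: raw DV stream, saved as .dv
        prefix,             # hypothetical base name for the output files
    ]


def handbrake_command(source: str, target: str, quality: int = 20) -> list[str]:
    """Build a HandBrakeCLI invocation that transcodes one .dv file to H.264."""
    return [
        "HandBrakeCLI",
        "-i", source,
        "-o", target,
        "-e", "x264",        # software H.264 encoder
        "-q", str(quality),  # constant-quality value; 20 is only a placeholder
    ]


if __name__ == "__main__":
    print(shlex.join(dvgrab_command("tape01-")))
    print(shlex.join(handbrake_command("tape01-clip.dv", "tape01-clip.mp4")))
```

Building the commands as lists keeps quoting out of the picture until the very end, which makes it easy to review them before running anything against a tape.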

The transcoding process took the most time, but was also hands-off. Each tape holds an hour of content and took about 10 to 15 minutes to transcode on my machine. At one point, I got lazy and attempted to compile a version of Handbrake with AMD VCE hardware encoding support. In exchange for 1/3 the encoding time, though, the video quality suffered drastically. Thus, I begrudgingly settled for long encode times.

DV includes timestamping in the digital stream, and dvgrab automatically splits clips on the tape into their own files, so I did not have to do any splitting or merging of the video after transcoding. Since each tape often only recorded one event, I simply used a line of shell to rename all the files in the output directory to have the desired prefix, then copied them to storage. I worked on these tapes until May or so.
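The renaming step was a shell one-liner; the same idea can be sketched in Python as follows. The function name, the *.mp4 pattern, and the example prefix are hypothetical, and the sketch assumes all clips for one event sit in a single directory.

```python
from pathlib import Path


def add_prefix(directory: str, prefix: str, pattern: str = "*.mp4") -> list[Path]:
    """Rename every file matching `pattern` in `directory` so it starts with `prefix`."""
    renamed = []
    # sorted() materializes the listing first, so renames cannot disturb iteration.
    for path in sorted(Path(directory).glob(pattern)):
        if path.name.startswith(prefix):
            continue  # already renamed; makes the function safe to re-run
        target = path.with_name(prefix + path.name)
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        path.rename(target)
        renamed.append(target)
    return renamed
```

For example, add_prefix("out", "1999-01-01-my_event_here-") would turn out/clip001.mp4 into out/1999-01-01-my_event_here-clip001.mp4.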

Hi8

From late May until now, I worked on the Hi8 tapes. I was more worried about these tapes, as they were older. Neither the tape deck mentioned earlier nor an old camcorder could play them, so I ended up buying another secondhand camcorder for about $100. Fortunately, it was smooth sailing after that. All the tapes played perfectly, save for one recorded at a beach, which had damaged audio.

This camcorder came with a cable from 3.5mm to RCA (one video pin and one mono audio pin). I bought an RCA (yellow-red-white)-to-HDMI converter and connected the camera to it, leaving one of the audio ports dangling. I could have used a splitter cable to split the mono audio pin into stereo before going to HDMI, but I couldn’t find a splitter in our messy boxes of old cables. After the HDMI conversion, I used one of the many dirt-cheap HDMI-to-USB capture dongles available online to connect everything to my desktop machine.

On the software side, I used OBS to capture the video stream from the USB device, encoding it directly to 480p H.264 as it streamed in.

Because the data is analog, there are no clip markers in the stream and OBS outputs every tape as a 2-hour-long MP4 file. Thus, I had to split the parts manually. I cooked up a quick and dirty script in Racket that would read in a text file of clip markers in the following format:

1999-01-01-my_event_here 0:01 10:00
1999-01-01-my_event_here 10:00 11:00
...

and output a list of ffmpeg commands to split the original file into the correct pieces. The commands would then be piped to GNU parallel to run in parallel.
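The original script was written in Racket and is not reproduced here; the Python sketch below is a stand-in that shows the same idea under a few assumptions. It reads markers in the format above, converts the timestamps to seconds, and emits one stream-copy ffmpeg command per clip. The source file name tape.mp4 and the numbered -01, -02 suffixes (added so clips that share a name do not overwrite each other) are hypothetical.

```python
import shlex
from collections import Counter


def to_seconds(stamp: str) -> int:
    """Convert 'M:SS' or 'H:MM:SS' into whole seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def split_commands(marker_text: str, source: str) -> list[str]:
    """Return one ffmpeg command line per 'name start end' marker line."""
    seen = Counter()
    commands = []
    for line in marker_text.splitlines():
        line = line.strip()
        if not line or line == "...":
            continue  # skip blank lines and the elision marker
        name, start, end = line.split()
        begin = to_seconds(start)
        duration = to_seconds(end) - begin
        seen[name] += 1
        target = f"{name}-{seen[name]:02d}.mp4"  # assumed naming scheme
        args = [
            "ffmpeg",
            "-ss", str(begin),    # seek in the input before decoding
            "-i", source,
            "-t", str(duration),  # keep this many seconds
            "-c", "copy",         # copy streams, no re-encode
            target,
        ]
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    markers = "1999-01-01-my_event_here 0:01 10:00\n1999-01-01-my_event_here 10:00 11:00\n"
    print("\n".join(split_commands(markers, "tape.mp4")))
```

Printing one command per line means the output can be piped straight into GNU parallel. Because stream copy can only cut on keyframes, the clip boundaries may land slightly off the requested timestamps.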

Finally, for good measure, I had to correct the audio stream. Since I did not use an analog splitter, all of the audio was in the left channel. I simply had the Racket script add an ffmpeg filter that duplicates the left channel into the right channel, doing the split in software.
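One way to express that channel duplication is ffmpeg's pan audio filter, sketched below in Python. This is an assumption about which filter fits, not a record of the exact one the Racket script used. Applying an audio filter means the audio has to be re-encoded (AAC is assumed here), while the video stream can still be copied untouched. The function name and file names are hypothetical.

```python
import shlex

# Map the left input channel (c0) onto both output channels.
LEFT_TO_BOTH = "pan=stereo|c0=c0|c1=c0"


def split_with_audio_fix(source: str, start_seconds: int,
                         duration_seconds: int, target: str) -> str:
    """Build an ffmpeg command that cuts a clip and copies left audio to both channels."""
    args = [
        "ffmpeg",
        "-ss", str(start_seconds),
        "-i", source,
        "-t", str(duration_seconds),
        "-c:v", "copy",       # video is copied, not transcoded
        "-af", LEFT_TO_BOTH,  # audio filter: left channel into both outputs
        "-c:a", "aac",        # filtered audio must be re-encoded (assumed codec)
        target,
    ]
    return shlex.join(args)


if __name__ == "__main__":
    print(split_with_audio_fix("tape.mp4", 600, 60, "clip.mp4"))
```

shlex.join quotes the filter string, so the | characters survive when the command is handed to a shell.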

Since there was no need to transcode the video stream, the ffmpeg commands executed very quickly, usually processing one 2-hour tape within a minute at most, leaving me with a group of nicely split MP4 files.

UPDATE 2021 July 2nd

My oldest sister suggested that we had even older VHS tapes. After some hunting around the house, I’ve come up with about 20 more VHS tapes to be digitized. Luckily, the flow is identical to that for Hi8 tapes.

Storage and Presentation

I store the files on my NAS server, which I built in late 2019. It runs Debian Stable with ZFS, with nightly encrypted backups to Borgbase. I’m currently on their middle-tier plan, which is $80 per year. Not bad. The usable pool capacity is 24T (4x8T = 32T raw, with 8T lost to RAID redundancy), of which I’m using 1.33T (including files other than these tapes).

In terms of access and presentation, I expose SMB locally so my family members can directly access the files from their Windows machines. I also expose an HTTP server using filebrowser, providing a lightweight HTTP file explorer.

Wrapping Up

I could not have done this project, at this price point, without the array of high-quality open source tools at my disposal. The most important ones were probably Handbrake, ffmpeg and OBS, along with the libraries backing them, such as libx264. I’ll definitely drop those projects a donation.

Of course, I didn’t forget the original objective of this project, which was to allow my family to more easily watch old recordings. In digital form, we can now jump between files and seek within recordings with one click of a mouse button, instead of having to switch tapes and rewind back and forth. This has been a real boon, and my family has already spent three weekend afternoons enjoying old recordings of us kids when we were little.

It also made me quite grateful for the hard work my parents put into raising my siblings and me. Tape after tape filled with birthday parties, extracurricular activities, and holiday celebrations gave us all a reminder of how our time was spent over the last couple of decades.

This marks the end of the tape project. I’m sure we’ll find a few straggler tapes around the house eventually, but all the tapes I could find have been digitized, sorted, and boxed up neatly. I’m glad I was able to finish this project before relocating to Seattle, as my progress would have surely slowed down drastically otherwise.

The remaining looming task is to digitize photos. Photos are much more work to digitize, because I have to sit there and baby them as they go through a scanner. With tapes, I simply let them roll while I’m at work, occasionally starting a transcode or copying files around. While I’ve already digitized a dozen or so albums, it doesn’t help that there are many more photos than tapes. With my upcoming relocation to Seattle, forward progress here isn’t looking too good. Hopefully I can continue chipping away at it over the next few years.

Until next time!

Image: a terminal showing the output of a Linux du -sh command. The output is 471G for the current directory.