Comments on "3 TB disks are Here" from Linux Magazine

Linux Magazine published an article last week, 3 TB Drives are Here. On Twitter, I originally said it was wrong, but that’s a bit harsh. Parts of it, however, are very misleading, and parts of it are unnecessarily confusing.

The “2.199 TB” limit describes Logical Block Addressing (aka LBA), a scheme for addressing sectors on modern disks. Sectors are numbered 0 to n − 1, where n is the number of sectors on the disk (i.e. disk size in bytes divided by sector size). There’s nothing intrinsically limiting about LBA, other than how many bits you can devote to storing such an address. With this in mind, the sentence:

The LBA scheme uses 32-bit addressing under the MBR partitions.

is very misleading. I hate to be a grammar nazi, but it’s a misuse of active versus passive voice. This phrasing makes it seem as if LBA is the limitation; it’s not. Master Boot Record (MBR) partition tables are what limit LBA addresses to 32 bits, and with 512-byte sectors that is what limits partitions to 2.199 TB.
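To make the arithmetic concrete, here is a minimal sketch of where the 2.199 TB figure comes from. The 16-byte MBR partition entry stores its starting LBA and sector count as 32-bit fields; the names `MBR_ENTRY` and `mbr_limit_bytes` are made up for this illustration.

```python
import struct

# One 16-byte MBR partition entry, little-endian:
# status, CHS start, type, CHS end, starting LBA (32-bit), sector count (32-bit).
MBR_ENTRY = struct.Struct("<B3sB3sII")

def mbr_limit_bytes(sector_size):
    """Bytes addressable through a 32-bit sector number at this sector size."""
    lba_bits = 32  # width of the two "I" fields in MBR_ENTRY
    return (2 ** lba_bits) * sector_size

print(MBR_ENTRY.size)          # 16
print(mbr_limit_bytes(512))    # 2199023255552, i.e. about 2.199 TB
print(mbr_limit_bytes(4096))   # 17592186044416, if the disk reported 4 KB sectors
```

The limit scales with the sector size the disk reports, not with anything in LBA itself.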

The article then moves on to discuss 4 KB sectors. While nothing here is wrong, it ignores the fact that current “4 KB sector disks” on the market (i.e. those marketed as “Advanced Format”) do not work in the way described.

Most Advanced Format disks continue to report that their sectors are 512 bytes, a mode called 512e. Because of this, your “4 KB sector” disk still is limited to 2.199 TB when using MBR partition tables (the article, confusingly, implies otherwise).

However, they do use 4 KB sectors internally: each 4 KB physical sector holds eight 512-byte logical sectors. That is, requests for sectors 0 and 3 both read the same 4 KB sector internally. There are significant performance problems here: if you request sectors 7 and 8, these internally map to two different 4 KB sectors. This becomes a problem when your filesystem uses 4 KB blocks (i.e. most modern filesystems, including NTFS, ext4, XFS, etc.) that are not aligned to these boundaries: a 4 KB read may cause the drive to unnecessarily read 8 KB. The article does not mention anything about this sector alignment problem.
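The mapping can be sketched in a few lines. This assumes a 512e drive with 512-byte logical and 4096-byte physical sectors, and the function names are hypothetical, chosen only for this example.

```python
LOGICAL = 512                 # sector size the 512e drive reports
PHYSICAL = 4096               # sector size it uses internally
RATIO = PHYSICAL // LOGICAL   # 8 logical sectors per physical sector

def physical_sector(lba):
    """Physical 4 KB sector that holds the given logical sector."""
    return lba // RATIO

def physical_sectors_touched(start_lba, length_bytes):
    """How many physical sectors a read of length_bytes at start_lba spans."""
    sectors = (length_bytes + LOGICAL - 1) // LOGICAL  # round up to whole sectors
    last_lba = start_lba + sectors - 1
    return physical_sector(last_lba) - physical_sector(start_lba) + 1

print(physical_sector(0), physical_sector(3))   # 0 0
print(physical_sector(7), physical_sector(8))   # 0 1
print(physical_sectors_touched(0, 4096))        # 1 (aligned 4 KB block)
print(physical_sectors_touched(63, 4096))       # 2 (misaligned: 8 KB read internally)
```

Logical sectors 0 through 7 share physical sector 0, so a 4 KB filesystem block only stays within one physical sector when it starts on a logical sector number that is a multiple of 8.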

Discussing other operating systems, the article vaguely mentions “several operating systems” have switched to GPT (GUID Partition Tables). I really hate how vague the article is here: as far as I know, the only OS that does this by default is Apple’s Mac OS X. The article sells Linux short when it says:

In the consumer world this is a downside since most motherboards don’t have a BIOS that is GPT capable. This can affect all operating systems including Linux.

because, in fact, most motherboards do have a BIOS that can boot from GPT, especially when you use a hybrid MBR. And Linux, with GRUB 2, works fantastically with them. Unfortunately, compatibility is a crapshoot, and is not advertised. However, all the systems I’ve experimented on, some as old as 2005, worked fine booting from GPT. Where Linux definitely falls short is that no distribution (AFAIK) will set up a GPT for you.
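If you are not sure which kind of partition table a disk carries, the first two sectors are enough to tell. The following is a rough sketch, assuming 512-byte logical sectors; `describe_partition_table` is a name invented for this example, and it only looks at the MBR boot signature, the MBR partition type bytes, and the “EFI PART” signature that begins a GPT header at LBA 1.

```python
SECTOR = 512  # assumed logical sector size

def describe_partition_table(first_two_sectors):
    """Classify a disk from its first 1024 bytes (LBA 0 and LBA 1)."""
    mbr = first_two_sectors[:SECTOR]
    lba1 = first_two_sectors[SECTOR:2 * SECTOR]
    has_mbr = mbr[510:512] == b"\x55\xaa"   # boot signature at the end of LBA 0
    has_gpt = lba1[:8] == b"EFI PART"       # GPT header signature at LBA 1
    # Type byte of each of the four MBR entries (offset 4 inside a 16-byte entry).
    types = [mbr[446 + 16 * i + 4] for i in range(4)] if has_mbr else []
    # A protective MBR holds only a 0xEE entry; a hybrid MBR adds real entries.
    if has_gpt and any(t not in (0x00, 0xEE) for t in types):
        return "hybrid MBR + GPT"
    if has_gpt:
        return "GPT"
    if has_mbr:
        return "MBR"
    return "unknown"

# Hypothetical usage (needs read access to the device; the path is an example):
# with open("/dev/sda", "rb") as disk:
#     print(describe_partition_table(disk.read(2 * SECTOR)))
```

This does not validate the GPT header checksum or the backup header, so treat the result as a hint rather than a verdict.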

With that in mind, it’s difficult to say:

Linux is ready for 4KB drive sectors with 64-bit LBA addressing

when it really isn’t. The largest obstacle is the sector alignment problem that the article glosses over, best explained by Theodore Ts’o’s Aligning filesystems to an SSD’s erase block size. His post, in short:

  • Linux partitioning utilities are hard-coded to assume 512-byte sectors, which creates problems for 4 KB-sector disks and disks with larger block sizes (i.e. SSDs)
  • Various filesystem structures are not aligned to 4 KB boundaries (Ts’o points out LVM)

All of which kill performance, and in the case of SSDs, shorten lifespan.
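Checking whether a partition is aligned is simple arithmetic on its starting sector. Here is a minimal sketch, assuming 512-byte logical sectors; `is_aligned` is a hypothetical helper, and the 128 KiB erase block size is only an example value, since real SSDs vary.

```python
def is_aligned(start_lba, logical=512, boundary=4096):
    """True if a partition starting at start_lba begins on a boundary multiple."""
    return (start_lba * logical) % boundary == 0

# Sector 63 is the traditional DOS-compatible start of the first partition.
print(is_aligned(63))                        # False: 32256 is not a multiple of 4096
# Sector 2048 puts the partition at 1 MiB.
print(is_aligned(2048))                      # True
# Example SSD erase block of 128 KiB (an assumed value for illustration).
print(is_aligned(2048, boundary=128 * 1024)) # True
print(is_aligned(64, boundary=128 * 1024))   # False: 4 KB aligned, but not to 128 KiB
```

On Linux the starting sector of a partition can be read from `/sys/block/<disk>/<partition>/start`, which is expressed in 512-byte units.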

One thing that bothers me about this article is that while it tries to explain the issues involved with 4 KB sector disks, it does nothing to tell you how to mitigate or avoid any of them. In the next couple of weeks, stay tuned for a few articles from me explaining how you can get around them with Linux.

gpxsplitter: Split GPX files with their waypoints

gpxsplitter splits multi-track GPX files, containing waypoints, into individual one-track GPX files with their respective waypoints.

GPX files containing multiple tracks and waypoints jumbled together are produced on export by many GPS units, particularly MTK chipset-based devices such as the Qstarz Q1000 and Transystem i-Blue 474. Separating tracks and their associated waypoints was a headache until gpxsplitter came along. It’s meant to be run first thing after downloading data from your unit via gpsbabel or mtkbabel. It’s a quick little script written in Python 2.x, with dependencies on mxDateTime and lxml.

You can get it from the gpxsplitter repository on gitorious, and the GpxSplitter wiki page is the one-stop place that will collect information about it.

I thought about turning this into a web service, where users can upload their GPX files and have them split, but I’d like to know the demand for such a service before writing it. Ideally, gpxsplitter should be part of gpsbabel or something… but yeah, I’ll save the XML parsing in C for a very, very rainy day.

There are probably any number of bugs. If you find one, please let me know, and send a test case too!

Play WebM in Internet Explorer 9

Update: Google now offers a WebM plugin for Internet Explorer 9, much easier than what I’ve detailed below.

Google’s recent announcement deprecating H.264 for Chrome (see my thoughts on it) means it’s likely that WebM will become the de facto standard for the HTML5 video tag, which Internet Explorer 9 supports. Unfortunately, Internet Explorer 9 does not (yet) ship with WebM, despite a lot of misleading PR indicating some kind of “compatibility”.

So, how do you play WebM with Internet Explorer 9?

The easiest way is to use the DirectShow filter pack from Xiph. Download and run the installer, available for both 32-bit and 64-bit Windows, and not only will you be able to play WebM/VP8, but also Ogg/Theora, Vorbis, Speex, and FLAC. It’s a royalty-free, open-source standards smörgåsbord!

What do you do next? Of course, submit feedback! Click Send Feedback under Internet Explorer’s Tools menu, and simply ask Microsoft: please support WebM!

Clarification: Don’t install the Support for HTML

Note: Internet Explorer 9 is a beta, as well as Xiph’s DirectShow filters. IE9 doesn’t support a lot of

Google Chrome deprecates H.264: the right move, but little change for HTML5 video

Google has decided to deprecate H.264 in Chrome. This is nothing but good for the future of web video. With support in three major browsers (Firefox 4, Chrome, and Opera) it means that WebM/VP8, instead of H.264, will become the de facto codec for HTML5 video.

I’ve talked to several people who think that this move has killed HTML5 video. I’m not sure I follow the logic — little has changed, except what will become the dominant codec.

You can say it’s made Flash the least common denominator, which ignores the fact that Flash already IS the least common denominator for web video.

Regarding codec fragmentation, little has changed there too: Microsoft’s Internet Explorer 9 and Apple’s various WebKit products still do not have WebM/VP8 support. Content providers wanting to support HTML5 still need to encode to both H.264 and WebM.
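In practice, “encode to both” means listing both files in the video tag and letting each browser pick the first one it can play. Here is a small sketch that generates such markup; `video_markup` and the file naming scheme are assumptions made for this example, while the type strings are the MIME types and codec hints commonly written for WebM and baseline H.264.

```python
from xml.sax.saxutils import quoteattr

# MIME types with codec hints for each container.
SOURCE_TYPES = {
    "webm": 'video/webm; codecs="vp8, vorbis"',
    "mp4": 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"',
}

def video_markup(basename, formats=("webm", "mp4")):
    """Build a <video> element offering basename.<fmt> for each format."""
    lines = ["<video controls>"]
    for fmt in formats:
        src = quoteattr("%s.%s" % (basename, fmt))
        # quoteattr picks single quotes here because the value contains double quotes.
        lines.append("  <source src=%s type=%s>" % (src, quoteattr(SOURCE_TYPES[fmt])))
    lines.append("  <!-- Flash or download-link fallback goes here -->")
    lines.append("</video>")
    return "\n".join(lines)

print(video_markup("clip"))
```

Browsers try the sources in order, so listing WebM first means a browser that supports both formats will use WebM.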

With the codec fragmentation problem as yet unsolved, do content providers have any reason to use HTML5 video when Flash still is the least common denominator? Well, Flash is no longer included with Windows 7 or Mac OS X (and was never included with any reputable Linux distribution). Are content providers still willing to force users to download plugins, when they can just use the dominant HTML5 video codec?

I don’t have the answers to these questions, nor does anyone else. Nobody said that the problem of open video would be solved easily or overnight. But focusing on WebM is, in my opinion, a step in the right direction.

In the meantime, WebM is winning, so why don’t you start encoding your videos to WebM now? On SamatsWiki I’ve a sparse page on encoding to WebM (which will work with stock Debian/Ubuntu tools), as well as one on encoding to Ogg Theora. If you’re on Linux, the easiest way to convert videos is OggConvert, an easy-to-use GNOME-based GUI. Publishing them on the Web is just as easy. Check out the HTML5 video chapter in Mark Pilgrim’s Dive Into HTML5, or Jakub Steiner’s How to get your clips on the web.

