There is a WhatsApp message going round that crashes the app when you touch the black point within it. And it’s a regular, everyday text rendering bug in Android. WhatsApp is just how it’s spreading. And I wasn’t going to talk about it, because text rendering bugs are now pretty commonplace and pretty dull. There isn’t a story there. You send some weird characters, you trigger an obscure bug somewhere in the text renderer, which is the bit of code that puts text on the screen, and because no-one’s ever thought to test that precise combination of characters in that precise order, because it’d never happen in everyday conversation, it crashes. Just like all the other times. And it’ll get fixed pretty soon. But there is something interesting in the characters that were used to do it this time round.
If you analyse what’s actually being sent in the Black Spot message, then there are a whole load of invisible characters. Unicode, which is the international standard for transferring text between computers, contains well over a hundred thousand characters designed for every language in the world. But it also has a few invisible characters defined in it. One of those is a zero-width space, which is designed to indicate a good point to put a line break in a really long word. Others dictate whether characters and the diacritics above or below them should be combined or left separated. And a few of them say whether the text should go left-to-right or right-to-left.
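If you want to see a few of those invisible characters for yourself, here is a minimal Python sketch using the standard library's `unicodedata` module. The particular five code points, and the helper name `describe`, are picked for illustration; they are not taken from the Black Spot message itself.

```python
import unicodedata

# A handful of invisible code points of the kinds described above.
# This list is illustrative, not exhaustive.
INVISIBLE = ["\u200b", "\u200c", "\u200d", "\u200e", "\u200f"]


def describe(ch):
    """Return (code point, official Unicode name, general category)."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in INVISIBLE:
    # Each of these is in category "Cf" (format): it takes up no space
    # on screen but still affects how the text around it is laid out.
    print(*describe(ch))
```

All five are ordinary characters as far as a string is concerned: they count towards its length and they get copied and pasted along with everything else.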
Lots of languages in the world go right-to-left: for example, anything that uses Arabic script or the Hebrew alphabet. If you’re writing in those scripts, then the Unicode standard takes care of everything for you: those characters are ‘strongly typed right-to-left’. Use the Hebrew alphabet and any modern system, phone or desktop, will put it right-to-left automatically.
Left-to-right is not the default, no matter how much the English-speaking world might think it is. Those strongly-typed characters mean that if you want to, say, put some Hebrew text in the middle of some English, you can do that: the bit of code that’s rendering the text will work out how long that right-to-left sequence is, leave enough space, and then carry on with the rest of the left-to-right text. Or if you want to put numbers, which are written left-to-right, in the middle of Arabic right-to-left text, then it’ll work it out the same way.
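The direction attached to each character can be looked up directly. This is a small Python sketch, again using `unicodedata`; the sample characters and the helper name `bidi_class` are chosen here for illustration. One wrinkle worth knowing: in the Unicode data, digits are not strongly left-to-right but have their own "European number" class, which is part of how the renderer knows to treat them specially inside Arabic text.

```python
import unicodedata


def bidi_class(ch):
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)


samples = {
    "A": "Latin capital A",
    "\u05d0": "Hebrew alef",
    "\u0627": "Arabic alef",
    "7": "digit seven",
    " ": "space",
}

for ch, label in samples.items():
    # "L" is strong left-to-right; "R" and "AL" are strong right-to-left;
    # "EN" (European number) and "WS" (whitespace) take their direction
    # from the text around them.
    print(f"U+{ord(ch):04X}", label, bidi_class(ch))
```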
Mostly, you don’t have to worry about it. However: there are a lot of situations, edge cases, where this goes wrong. And I am not going to try and describe them, or even list them, because they’re all really confusing and boring, but in short: to help make things clearer, Unicode defines several types of invisible character that tell the text rendering system: hey, you know that next bit? Right to left. Even if it’s Latin letters, put them the other way around. Or vice versa. These invisible characters have been used for quite a bit of trickery over the years: it used to be that you could put them in a blog comment or a forum post. For younger viewers: blogs and forums were where we talked on the web back before everyone gave up and just piled inside a load of walled gardens. Anyway, put those directional characters in a comment on a badly-designed site and suddenly everyone’s writing backwards and it’s really confusing, and that passed for funny ten years ago.
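As a rough illustration of that old comment-box trick, here is a Python sketch that wraps some text in a right-to-left override, plus the blunt defence a more careful site might use. The function names `force_rtl` and `strip_bidi_controls` are made up for this example, and stripping every directional control is an assumption about what a site might do, not a recommendation: it would also mangle legitimate mixed-direction text.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE: lay out what follows right-to-left
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING: end the override

# Explicit directional controls: embeddings, overrides, isolates and marks.
BIDI_CONTROLS = set(
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
    "\u200e\u200f"                    # LRM, RLM
)


def force_rtl(text):
    """Wrap text so a bidi-aware renderer displays it right-to-left."""
    return f"{RLO}{text}{PDF}"


def strip_bidi_controls(text):
    """Remove every explicit directional control character from text."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


comment = force_rtl("hello")
print(repr(comment))                       # two extra invisible characters
print(repr(strip_bidi_controls(comment)))  # back to plain 'hello'
```

The stored string is still "hello" in the normal order; only the way it gets drawn changes, which is why the trick worked on any site that passed comments through untouched.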
Hidden inside the Black Spot text are about two thousand invisible characters that swap the direction of text back and forth. And they’re just regular characters, strongly typed: it’s like you put a Latin A and a Hebrew א next to each other a thousand times… except both of them are invisible and have a width of zero. And somewhere in that jumbled mess the system gets really confused: it can display the text, at least, but when you tap it, it has to work out exactly which character you’ve touched, so that you can highlight it or copy and paste it… and those thousands of invisible characters cause a problem, the details of which will no doubt be hashed out by the Android development team in the next week or two.
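To get a feel for the shape of that text, here is a Python sketch that builds a similar string and counts how often the direction flips. It rests on assumptions: that the invisible characters are the LEFT-TO-RIGHT MARK and RIGHT-TO-LEFT MARK, that they strictly alternate, and that the visible dot is U+26AB. The helper names are hypothetical, and this only models the structure of the text; it does not reproduce or explain the crash.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, strongly left-to-right
RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, strongly right-to-left


def build_payload(pairs):
    """Return `pairs` alternating LRM/RLM pairs (assumed layout)."""
    return (LRM + RLM) * pairs


def count_direction_flips(text):
    """Count changes of direction between consecutive strong characters."""
    flips = 0
    previous = None
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls not in ("L", "R", "AL"):
            continue  # weak and neutral characters have no direction of their own
        direction = "L" if cls == "L" else "R"
        if previous is not None and direction != previous:
            flips += 1
        previous = direction
    return flips


# "\u26ab" (MEDIUM BLACK CIRCLE) stands in for the visible dot.
message = "tap here " + build_payload(1000) + "\u26ab"
print(len(message), "characters,", count_direction_flips(message), "direction flips")
```

On screen, a string like this looks like a couple of words and a dot. Underneath, the renderer has thousands of tiny zero-width runs to keep track of when it maps a tap back to a character.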
Does the black point in the message have anything to do with it? No, probably not. Probably they could have picked any emoji, or maybe any character at all and put it in there, and it would still have worked. We won’t know for a while. But the Black Spot itself almost certainly isn’t the cause of your phone’s doom.
Thanks for reading.