Thursday, July 26, 2007

Five Finger Keyboards (2)

So "chording keyboards" are what I "invented" in my blog entry on five finger keyboards. My thanks to all you folk who responded with so many excellent comments. I exposed my ignorance, you diluted it with great feedback. It looks like simple binary encoding is the norm for chording keyboards, giving only 31 combinations with 5 keys. Sequenced chords give 324, and Alex is already exploring that idea. Doubly sequenced chording (the order that you lift your fingers counts) and extended sequences didn't ring any bells, so maybe that small part of my idea was novel.

Encouraged by your response, I have decided to share some of the more trivial details that occurred to me on this topic. Hopefully someone will build one of these gizmos and try it out; it's the only way to tell how well they will work.

Balance

Commenters pointed out that it could be hard to keep the mobile phone steady if you need to press all the finger buttons on one side, but not the thumb button on the other side. One edge of the phone could rest against the base of the thumb, which would help. If any buttons got clicked by the thumb base, they would be ignored since they would not have been enrolled. The top two buttons on either side of the phone could be replaced by a rocker switch for the thumb tip to operate. Many phones already have a rocker switch to control volume. The rocker switch would allow the thumb to express three values (up, down, off) instead of just two (on, off), considerably increasing our code space of 324 combinations. If all 4 fingers are pressing on one edge, the thumb tip could press evenly on both halves of the rocker switch so that it doesn't rock up or down.

Similarly, the thumb might need to press its switch while all 4 fingers don't press their buttons, leading to an imbalance. The finger buttons could be recessed in small saucer-shaped depressions. When the finger tips are straight, they would exert pressure on the rear edges of these depressions rather than the buttons, and could thus counter-balance the pressure of the thumb. When users need to press the finger buttons, they would flex their fingertips inwards.

Repeating keys

There are some buttons, like space, tab, and the cursor movement keys, that we often need to press repeatedly. If we had to go through a multi-finger sequence each time, we would lose patience very quickly. We could cheat our way round this limitation in the following ways:
  • Suppose I must key 5 and 1 (thumb) to get to the cursor movement keys. Once I do, the soft labels next to fingers 2 through 5 all change to the familiar arrow up, down, left and right symbols. I keep key 1 pressed (that's why I assign it to the thumb, it's strong) and release finger 5, its selection job is done. I can then press and release any of keys 2 through 5 any number of times, keeping 1 down, and I get the corresponding cursor movement each time. If I press and hold any cursor key, I start to get automatic repeats, maybe 2 per second to start, speeding up to 8 per second if I persist. Once I release key 1, the soft labels next to keys 2 through 5 revert to their initial values.

  • Keys 4 and 1 might give me the 4 tab keys (up, down, left, right). What, you didn't know about tab up and tab down? You must be using a keyboard with a small number of combinations!

  • Similarly, any terminal key (i.e. one that finally selects a character) could give automatic repeats if held down long enough, maybe one second. This delay would have to be customisable by the user, since while we're learning how to use these things we'll be slow. Or maybe the phone could measure the speed with which we press key combos and adapt automatically. Why not, it has nothing better to do while waiting for keyboard input.
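The repeat behaviour in the bullets above could be sketched roughly like this. The one-second delay and the 2 to 8 repeats per second come from the text; the linear ramp, its three-second length, and the name `repeat_rate` are assumptions made purely for illustration.

```python
def repeat_rate(held_s, delay_s=1.0, ramp_s=3.0, start_hz=2.0, max_hz=8.0):
    """Repeats per second once a key has been held for held_s seconds.

    delay_s is the user-adjustable wait before repeating starts; ramp_s is
    an assumed time taken to speed up from start_hz to max_hz.
    """
    if held_s < delay_s:
        return 0.0  # still counts as a single press
    progress = min((held_s - delay_s) / ramp_s, 1.0)
    return start_hz + (max_hz - start_hz) * progress


for held in (0.5, 1.0, 2.5, 4.0, 10.0):
    print(held, repeat_rate(held))
```

An adaptive phone could simply adjust `delay_s` as it measures how quickly the user keys combos.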

Remembering key sequences

We'll probably end up with learned committees meeting in Geneva to argue about which key sequences should lead to which characters. As qwerty has shown, first horse past the post gets the prize. We need to think about this problem before some arbitrary and clunky scheme gets adopted by default. Here are some rough ideas. I hope that they will provoke lots of thought and discussion, and get improved out of sight.

We need to group keys into related sets so that people find it easy to remember where they are in our code space. The alphabet and numerals are easy because they have a natural sequence. Accented alphas like à, á, â, ã, ä, and å could be located just a bit deeper in the code tree than their unaccented counterparts. Once you reach a, for example, the accented variations could appear on other keys. Since you could use up a few keys and fingers reaching the letter you want, we could adopt the same cheat that we did for the cursor keys: once you reach the bare letter, you only have to keep the last key that you clicked down, and the other keys all become available to select accented variations. If you release the last clicked key without clicking any other key after depressing it, you get the bare letter.

Here are some other possible logical groupings of special characters:
  • Punctuation: . , ; : ? ! " '
  • Arithmetic: + - * / ^ ( ) = != < > =< >= !
  • Brackets: [ ] { } ( ) < >
Yes, some characters like ! and ( ) appear in more than one logical group. Why not? We have a nice big code space, unlike those clunky old qwerty keyboards.

Modal spaces

I have suggested a couple of times that you may arrive at a place in the code tree where you can release all the keys that you pressed, except for the last one, and all the other keys get new functions and new soft labels. The cursor movement keys could fall in one such space, and the tab keys another. Paging the display up / down / left / right is another obvious candidate. This mode is different from usual, where you keep all keys that you have pressed down till you get to your final destination. Having different modes could lead to confusion. At the very least, we could use clues like different soft label background colours to indicate the mode that each key is currently in, e.g.
  • a-i may denote a soft key that we have yet to click
  • a-i may denote a soft key that we have already clicked
  • á-å may denote a soft key that we have already clicked, but which now has a new function
Even today's mobile phones have input modes, so the users shouldn't find the concept too alarming. When you're capturing a new contact and you move the cursor to a phone number input field, the phone usually switches to numeric keyboard mode, while when you're entering a name, it's in alpha mode.

Happy imagineering, folks, we may be inventing some small part of the future. Which, as Alan Kay remarked, is a lot easier than trying to predict it.

Monday, July 23, 2007

Five Finger Keyboards

In previous blog entries I have talked about mobile phones replacing desktop computers and the ways in which we could run applications on them. The biggest inhibitors that phones present are their dinky screens and keyboards. Screen resolutions are rapidly getting better, and iPhone has shown that if you can use the real estate that is usually hogged by buttons, you can get a good-sized screen onto a mobile phone. But without buttons, how can you enter text, or operate the phone's menus? iPhone offers you a virtual keyboard. When you need to enter text, an image of a keyboard pops up. You activate the keys by poking them with your fingertips. But there isn't much space for the keyboard. The keys are so close together that it's hard to touch one without touching others next to it. The iPhone makes it easier for you by using a form of predictive text, similar to what normal mobiles do; it checks the spelling of each word that you type, tries to work out what keys you meant to touch, and fixes the spelling on the fly. This approach works, but you will probably use only one fingertip to key in text, and you still need to aim quite carefully, so it isn't going to be fast. Some mobile devices like the Blackberry provide a full qwerty keyboard, but space limitations are such that you have to use a stylus, or point and aim with one finger very carefully, to use such a small keyboard. There is clearly still room for improvement.

Five Key Keyboards

Let's think of another approach - the five key keyboard, one for each finger and the thumb. Not an entirely new idea - there are about 2,000 Google hits for "five [or 5] key keyboards". They have been used to control a number of simple devices. But can we scale this up to the big time, to handle general text entry of the sort that we would usually use a full-size desktop keyboard to do? Let's try.

Simple Binary Combinations

A general desktop keyboard has about 120 keys, a lot more than the five we're looking at. But suppose we allow the user to press combinations of keys on our five key keyboard? We use multi-key combinations on desktop keyboards, after all: Shift-a for A, Ctrl-c to copy, Ctrl-Alt-Del to freshen up our desktop software. Five binary (on/off) keys allow for 2*2*2*2*2 = 32 unique combinations. We need to assign one of these combinations (no keys pressed) to the gaps between characters, which leaves 31 that we can use. Enough for the Latin alphabet and a few other keys, such as a numeric shift, Alt shift, Ctrl shift.
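As a quick check of that arithmetic, here is a small sketch that enumerates the on/off patterns directly. The variable names are just illustrative.

```python
from itertools import product

KEYS = 5  # one key per finger and thumb

# Every on/off pattern for five keys, as tuples of 0s and 1s.
patterns = list(product((0, 1), repeat=KEYS))

# The all-off pattern is the idle state, so it cannot carry a character.
usable = [p for p in patterns if any(p)]

print(len(patterns), len(usable))  # 32 31
```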

Sequenced Combinations

That's interesting, but can we do better? Yes we can, if we take the order in which the keys are depressed into account. Not such a strange idea, since Shift-a gives a different result to a-Shift, after all. That would give us:
5 + 5*4 + 5*4*3 + 5*4*3*2 + 5*4*3*2*1 = 325 combinations.
Even less one for the gap between characters, that's quite impressive: it's more than the number of combinations that we can get on a desktop keyboard using two-key combinations (a, Shift-a, Alt-a, Ctrl-a for example).
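The sum can be checked by enumerating the press orders themselves. This is only a counting sketch; the key numbering (1 for the thumb, 2 to 5 for the fingers) is an assumption.

```python
from itertools import permutations

KEYS = (1, 2, 3, 4, 5)  # assumed numbering: 1 = thumb, 2-5 = fingers

# Ordered sequences of distinct keys, grouped by how many keys are pressed.
counts = [len(list(permutations(KEYS, n))) for n in range(1, len(KEYS) + 1)]
total = sum(counts)

print(counts, total)  # [5, 20, 60, 120, 120] 325
```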

Doubly Sequenced Combinations

Can we do better? Yes we can, because once we have depressed a sequence of keys, we need to release them, and if we take the order in which we release the keys into account then we have:
5 + 5*4*2*1 + 5*4*3*3*2*1 + 5*4*3*2*4*3*2*1 + 5*4*3*2*1*5*4*3*2*1 = 17,685 combinations.
Now we're cooking on gas! That's enough combinations to handle the characters commonly encountered in Kanji and traditional Chinese writing, all with just five keys.
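For anyone who wants to check the sum, each term is the number of press orders for n keys multiplied by the n! orders in which those same keys can be released. A minimal sketch (needs Python 3.8 or later for `math.perm`):

```python
from math import factorial, perm

KEYS = 5

# For n keys pressed: perm(KEYS, n) press orders times n! release orders.
terms = [perm(KEYS, n) * factorial(n) for n in range(1, KEYS + 1)]
total = sum(terms)

print(terms, total)  # [5, 40, 360, 2880, 14400] 17685
```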

Extended Sequenced Combinations

Can we do better? Yes we can, because so far we have only considered the cases where we press various keys in various sequences, and then release them in various sequences. What happens if we allow sequences in which keys may be pressed, released, pressed, released, and so forth, the only limitation being that at least one key must be pressed at any time, otherwise the sequence ends? Then the sky is the limit - you would have an arbitrarily large number of different sequences at your disposal. In fact this would hold true even if you have only two keys on the keyboard. You could press one, then the other (2 combinations) and then release and repress either one or the other, gaining one net bit each time, until you have chosen the character that you want. Interesting, but tedious.
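Here is one way the two-key case might look in code. It is a sketch under assumptions: the keys are called "A" and "B", each release-and-repress of "A" carries a 0 and of "B" a 1, and the function names are made up for illustration.

```python
def encode_bits(bits, first="A"):
    """Build a two-key event list carrying `bits`, one bit per release-and-repress."""
    second = "B" if first == "A" else "A"
    events = [("press", first), ("press", second)]
    for bit in bits:
        key = "B" if bit else "A"
        events += [("release", key), ("press", key)]
    events += [("release", first), ("release", second)]  # sequence ends here
    return events


def stays_held(events):
    """True if at least one key is down after every event except the last."""
    down = set()
    for kind, key in events[:-1]:
        if kind == "press":
            down.add(key)
        else:
            down.discard(key)
        if not down:
            return False
    return True


events = encode_bits([1, 0, 1])
print(len(events), stays_held(events))  # 10 True
```

With two possible starting orders and n bits, this layout gives 2 * 2**n distinct sequences, which is why the count grows without limit.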

In practice, 324 combinations would be enough to satisfy most needs and to confuse most users, so we can forget about the extended sequences for now. We will return to them when we discuss the needs of people with disabilities.

Arranging the Keys

How would we arrange five keys on a mobile phone? Simplistically, we could locate four down one edge for the fingertips to operate, and one on the opposite edge for the thumb to operate. But which edge? Left-handed folk might like the opposite edge to right-handed folk. And even right-handed folk will likely switch hands from time to time; like when their one thumb is cut and bandaged (as mine happens to be right now); or when they're actually listening to a call - right-handed folk are mostly left-eared, and tend to hold their mobiles in their left hands while they're talking. They might still want to be able to operate the phone in this position to do things like:
  • increase or decrease the earpiece volume
  • put the current call on hold and answer a second call
  • terminate a call
  • initiate a conference call
So in practice, we would be better off with a row of buttons down both edges of the mobile phone. The first time a particular person picks up a mobile, the mobile wouldn't know how it is being held. It could measure the electrical resistance between the buttons and try to work out which finger is on which button. But most likely the phone would have to go through a finger enrolment procedure. It could show a picture of a hand and then highlight each finger/thumb tip in sequence. The user would respond by depressing the button that happened to be under the finger/thumb in question. This would quickly sort out whether the phone was held in a right hand or a left hand. The user may well have more flesh than just finger/thumb tips in contact with the mobile's buttons. It's common to have the base of the thumb pressing against one edge of the mobile to stabilise it. The mobile could detect this by measuring the electrical resistance between buttons, even those that aren't normally depressed. Chance depressions of buttons that hadn't been enrolled would be ignored.
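The enrolment step could be sketched as below. The button ids ("L1" to "L4" down the left edge, "R1" to "R4" down the right) and the function names are hypothetical; the only behaviour taken from the text is that each prompted digit is matched to the button pressed, and presses on unenrolled buttons are ignored.

```python
def enrol(responses):
    """Map button id -> digit, skipping digits that gave no response.

    `responses` maps each prompted digit to the button pressed, or None.
    """
    return {button: digit
            for digit, button in responses.items()
            if button is not None}


def enrolled_presses(pressed_buttons, enrolled):
    """Keep only presses on enrolled buttons, reported by digit name."""
    return [enrolled[b] for b in pressed_buttons if b in enrolled]


# Hypothetical result of one enrolment run.
responses = {"thumb": "R1", "index": "L1", "middle": "L2",
             "ring": "L3", "little": "L4"}
enrolled = enrol(responses)

# "R3" might be the base of the thumb resting on the edge; it is dropped.
print(enrolled_presses(["R3", "L2", "R1"], enrolled))  # ['middle', 'thumb']
```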

This Sounds Hard!

So technically, it's doable. But how easy would it be for a phone user to memorise up to 324 key depression combinations? Or even just about 100 if the user is happy with the Latin alphabet? Well, we can make it easier by using the concept of soft keys, which are already used by Java midlets when they run on mobile devices. Most modern mobiles have at least two keys just under the bottom edge of the screen with no labels permanently assigned to them. As you work through the mobile's menus, or run Java midlets, different labels are assigned to these keys. The labels appear on the screen just above their respective buttons. They may change as you move from one screen to another.

Once the user has registered the buttons that they are going to use to operate the mobile, it can pop up little labels next to each of these buttons. As the user starts depressing buttons, the labels next to the remaining buttons (and even next to the depressed buttons) could change. Assuming that the user has registered a full five buttons, they might navigate through the following sequence of button depressions to choose a character. The three panels are shown in order. In each panel, the number indicates the button next to which the label will appear, the buttons already depressed by the user in previous panels are marked (held), and the one that they choose to depress in that panel is marked (press). If a menu option appears underlined then that option will be taken if the user releases all of the keys, completing the character selection process.

Panel 1
  1. a-i (press)
  2. j-r
  3. s-z
  4. 0-9
  5. other...

Panel 2
  1. a-i (held)
  2. b-d (press)
  3. e-g
  4. h-i
  5. other...

Panel 3
  1. a-i (held)
  2. b-d (held)
  3. c (press)
  4. d
  5. other...

In this sequence, the user chooses successively a-i, b-d, then c to get the single character "c". Three key depressions are required in the sequence 1, 2, then 3. The keys are then released, and the mobile ignores the release sequence.
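The panels above amount to walking down a small tree of soft labels. The sketch below models only the branches shown in the panels; the data layout and the names `TREE`, `soft_labels` and `decode` are assumptions, and what happens on an early release is left out.

```python
# Hypothetical slice of the code tree: only the branches shown in the panels.
# A node is either a final character (str) or a dict of button -> (label, child).
TREE = {
    1: ("a-i", {
        2: ("b-d", {3: ("c", "c"), 4: ("d", "d"), 5: ("other...", {})}),
        3: ("e-g", {}),  # deeper levels are not given in the post
        4: ("h-i", {}),
        5: ("other...", {}),
    }),
    2: ("j-r", {}),
    3: ("s-z", {}),
    4: ("0-9", {}),
    5: ("other...", {}),
}


def soft_labels(presses, tree=TREE):
    """Labels offered on the not-yet-pressed buttons after the given presses."""
    node = tree
    for button in presses:
        node = node[button][1]
    if isinstance(node, str):
        return {}
    return {b: label for b, (label, _) in node.items()}


def decode(presses, tree=TREE):
    """Follow a press sequence to a character; release order is ignored."""
    node = tree
    for button in presses:
        node = node[button][1]
    if not isinstance(node, str):
        raise ValueError("sequence does not reach a character in this sketch")
    return node


print(soft_labels([1]))   # {2: 'b-d', 3: 'e-g', 4: 'h-i', 5: 'other...'}
print(decode([1, 2, 3]))  # c
```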

With this scheme, the novice user is guided with a series of soft key popups as to which key to depress next. With practice, users will quickly learn the sequence of depressions that give them the characters that they need most often. Users won't have to move their fingers from one button to another as they do with conventional mobile devices (or even desktop keyboards). They should be able to enter text quicker than folk do on current mobile keypads. Maybe with practice we will even be able to match our desktop keyboard skills. It sounds unlikely, but bear in mind that 93-year-old Gordon Hill, a former telegraph operator, beat 13-year-old Brittany Devlin in a texting speed test where Gordon transmitted the test message verbatim using a Morse code key while Brittany texted the message through a mobile keyboard, using common texting shortcuts. Hopefully we could get international standards on the key sequences that give us at least the commonly-used characters and controls, so that users can migrate from one manufacturer's mobiles to another's with relative ease.

Oops!

One thing we all know is that as we key characters, we make mistakes. Thank heavens for the backspace and delete keys! So if we implement the five key keyboard, how will we allow users to correct mistakes? The character set should certainly include sequences for backspace and delete, and also cursor movement keys (up, down, left, right, tab, back-tab, home, end, and so forth). If you happen to be part-way through a long sequence and depress the wrong button, the mobile may allow you to let go of that button and press another instead, ignoring your first choice, provided it does not implement the extended sequence scheme described above.

People with Disabilities

People with various disabilities also need to be accommodated in our digital world. When a person picks up a particular mobile for the first time, they will have to go through an enrolment process so that the mobile can discover which fingers they have positioned on which key, as described above. If the user happened to be missing one or more fingers or a thumb, they would not respond when those particular digits were highlighted on the mobile's screen. The phone would discover this, and would use a different character encoding scheme that falls within the capabilities of its user. Hopefully we can develop international standards for sequences that use fewer than five keys for the commonly used characters as well. If the disabled person can operate only a few keys, we could still define extended key sequences that will allow them to access all the characters that they need to operate a mobile.
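To see why extended sequences matter here, it helps to count how quickly the plain sequenced code space shrinks as keys are taken away. A small sketch, using the same counting as the sequenced combinations earlier (the function name is illustrative):

```python
from math import perm


def sequenced_count(keys):
    """Ordered press sequences of distinct keys, for a keyboard with `keys` keys."""
    return sum(perm(keys, n) for n in range(1, keys + 1))


for keys in range(1, 6):
    print(keys, sequenced_count(keys))
```

With only two or three usable keys, the simple sequenced scheme cannot even cover the alphabet, so longer press-and-release sequences would be needed.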

Patents Pending?

I have done absolutely no searches to find out if any of the ideas described in this article have been patented or not. If you're interested in developing any solutions that incorporate some of these ideas, maybe you should take a snapshot of the page to defend yourself against patent claims in the future. If it appeared here before it reached the patent office, it's prior art, and can't be patented. Let's all do some "open source" invention to try to improve the way things work for one another rather than always trying to make money from one another.