|
Post by Admin on Apr 30, 2020 16:44:29 GMT -5
I didn't think I'd ever get to this point, but I think I have to finally "bite the bullet" so to speak.
Up until now, my goal was to support EVERY Cyrillic letter, even all of the archaic ones. But, this is starting to complicate my goal for the "new, improved" Cyrillic layout. The truth is, no one writing in a modern Cyrillic-based language uses any of those letters, except for linguists and historians.
So, other than a very few of them, there are going to be no more archaic letters. The plans I had to make a number of "archaic" dead key tables aren't going to happen.
I have reviewed all 119 Cyrillic languages listed on Omniglot several times, so I'm pretty sure no one will miss the ones I am taking away. But, on the off chance that I yanked one I shouldn't have, and there is some user feedback, I can always find some way or other to get them back. But, I'd rather not, because the keyboard's memory capacity is getting low as it is (a familiar problem with the Q Keyboard), so if there are places where I can safely take things out, I am going to.
|
|
|
Post by Admin on Apr 30, 2020 20:52:06 GMT -5
The "clean-up" process is going pretty well. The tables are a lot cleaner and simpler now. I think they will make more sense and be easier to remember without all the archaic letters cluttering up everything. One key phrase I got in doing research on this was, "every Cyrillic Unicode symbol that says 'combining' is obsolete". Big clue there.
I was tempted to keep 3, the ones for combining hard sign, soft sign and modifier En ("H"). But the truth is, even those aren't needed. While I did see a couple of languages that use ordinary combining marks like Macron, Caron, etc. no one used the old combining Cyrillic symbols; not a one. So, ALL the combining symbols are gone.
It's actually kind of a relief to have all that complexity removed.
|
|
|
Post by Admin on May 2, 2020 13:08:43 GMT -5
Well, you know that saying, "you can't get something for nothing"? It's actually true ... darn it.
In my case, I had this grand idea of having two keyboards in one: Latin and Cyrillic, and you could swap them at will.
And you will still be able to.
Except ...
Under my first "grand idea", the "Cyrillic" keyboard still had a whole set of Latin letters as "live keys", and the "Latin" half of it had all of the Latin-based dead keys "translated" into Cyrillic with software, so that there was only one 'real' set of dead key tables for both keyboards.
For example, if you wanted to type Cyrillic ё with a dead key, the "e" part of it was a Cyrillic "е", which is U+0435. Then the CYRILLIC ё, which is U+0451, would appear. If you went through what I called the "Latin Gateway" to get a LATIN "ë", you would be using the LATIN Diaeresis table to translate CYRILLIC U+0435 into LATIN U+00EB, which is Latin "ë".
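To make those code points concrete, here is a small Python sketch. It is not how KbdEdit stores anything; the table name `DIAERESIS` and the helper `describe` are made up for illustration. It only shows that the look-alike letters are different characters and that a table keyed on the base letter picks the right one.

```python
import unicodedata

# Hypothetical Diaeresis dead key table: base letter -> accented letter.
DIAERESIS = {
    "\u0435": "\u0451",  # Cyrillic е (U+0435) -> Cyrillic ё (U+0451)
    "e": "\u00eb",       # Latin e (U+0065)    -> Latin ë (U+00EB)
}

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for base, result in DIAERESIS.items():
    print(describe(base), "->", describe(result))
```

The two results look alike on screen, but the printed names show one is Cyrillic and the other Latin.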
This is all a little hard to explain. ALL of my dead key tables are prepared as if the physical keys being typed are Latin. Then, when the data tables are prepared for use by the keyboard software (KbdEdit) I use a "macro" to convert the Latin character codes into Cyrillic. I do this so I don't need to keep and maintain two sets of dead key tables. And also, I have changed at times the exact set of Cyrillic letters I consider to be the "base" letters. I have the freedom to do that, whereas the physical keys on the QWERTY hardware can't be moved.
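As a rough idea of what such a "macro" does, here is a Python sketch. The real conversion is done on the KbdEdit data and is not shown in this thread, so the names `LATIN_TO_CYRILLIC`, `DIAERESIS_SOURCE` and `to_cyrillic_keys` are assumptions, and the three-letter key map is only a sample.

```python
# Illustrative sample only: which Cyrillic base letter sits on which Latin key.
LATIN_TO_CYRILLIC = {
    "a": "\u0430",  # а
    "e": "\u0435",  # е
    "o": "\u043e",  # о
}

# Dead key table written as if the physical keys were Latin:
# key typed -> Cyrillic character produced.
DIAERESIS_SOURCE = {
    "a": "\u04d3",  # ӓ
    "e": "\u0451",  # ё
    "o": "\u04e7",  # ӧ
}

def to_cyrillic_keys(table, key_map):
    """Re-key a Latin-authored table so Cyrillic base letters trigger it."""
    return {key_map[k]: v for k, v in table.items() if k in key_map}

cyrillic_table = to_cyrillic_keys(DIAERESIS_SOURCE, LATIN_TO_CYRILLIC)
print(cyrillic_table)
```

Changing the set of "base" letters then means editing only the key map, not every dead key table.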
Anyway ...
Now, we can't do that (share dead key tables across Latin vs. Cyrillic), because neither the Latin side nor the Cyrillic side has "extra" letters in the "other" keyboard's alphabet. And, there is only one set of dead key tables for the whole thing. So, if I tried to maintain the Latin Gateway, it wouldn't work, because there would be no Cyrillic for the Latin side, and vice versa.
(In theory, it's possible to duplicate every entry, so that for the letter E, you'd have one Latin E entry and one Cyrillic E entry for the same thing. The problem is (as usual) there's just not enough memory to hold completely redundant copies of every dead key entry. All the dead keys in total have about 4,000 entries. Other than tweaking what I have, there isn't a lot of room left to do more, and certainly not enough room to completely double their size. That's out.)
The net result is that the Latin side can only use Latin dead keys, and Cyrillic can only use Cyrillic. They can't "cross over" or be shared.
So, users WILL be able to fully write in Latin, or in Cyrillic, but just not at the same time.
For the two different versions, the only difference is which "kind" you get when you first start your computer. The default "kind" is the one that respects Caps Lock, while the other side does not. Otherwise, they will work the same way.
It's the best I can do, and for most people I am hopeful that will be enough.
|
|
|
Post by Admin on May 3, 2020 15:04:44 GMT -5
Once again, as soon as you think you "know" what languages do, they go and do something else.
I wrote above that the Latin Gateway couldn't be used, because for both "modes" of the keyboard to share all dead key tables, you'd need two sets of them: one with the "main" keys that you type (the "live" keys) encoded in Latin, and one in Cyrillic. And, that's still true.
But, I have run into cases where Cyrillic languages include Latin in their alphabets. A few cases I have already accounted for, such as A E and O with accents like Macron. But, that doesn't cover all cases. I need to allow the basic Latin alphabet, just the plain letters, to be accessible via the Cyrillic mode. After that, users may need to add a Macron or Dot Above or whatever else they might need.
That should cover most cases, but it doesn't account for when Cyrillic alphabets have Greek in them.
So, the story isn't quite over yet ...
|
|
|
Post by Admin on May 4, 2020 18:41:20 GMT -5
One of the "problem" languages I am dealing with is Neo-Assyrian. Their alphabet includes a Latin T with Dot Above. My current design has no way to add such a letter that makes sense and that would be of general use.
In the old design, to get this letter I would type 9 to get the Latin Gateway, then "." for the Dot Above dead key, then the T key. Done. Now, it's not so simple. I would have to type Alt+Caps Lock to switch to Latin mode, then Alt "." to get the Dot Above, then T, then Alt+Caps Lock to go back to Cyrillic mode.
That's a lot of typing. Not something you'd want to do very often.
Now, people don't embed Latin into Cyrillic languages very often. But, it does happen. They also include Greek sometimes too. I would have the same problem.
Now, it's possible I could add duplicate entries for Greek so that you could use both Latin and Cyrillic keys to access them. There MIGHT be enough memory left to do this just for the Greek letters (none of the accented forms). That may be what I am going to have to end up doing.
But, this is not nearly as convenient or clever as the old way of doing things. However, the whole reason I *changed* the old way of doing things is that for run-of-the-mill Cyrillic, like Russian, the old way was a little too hard to type.
So, I have to figure, do I want great flexibility and power more, or do I want ease of use more?
Then, there is this REALLY crazy idea ... suppose the Cyrillic and Latin dead key tables were merged, so that for instance there was only ONE Diaeresis dead key table, which had entries for both Latin and Cyrillic, rather than two distinct tables. It would require a massive restructuring, but the net result would be that there would be NO MODES any more. Instead, you'd have ONE keyboard that did everything. It would mean I'd have to make 3 modifiers like the Q Keyboard does.
This is pretty crazy. I will have to give it some thought. Hmm ..
|
|
|
Post by Admin on May 5, 2020 13:41:27 GMT -5
OK, gave it some thought ... It can be done, and it won't be crazy. But there have to be some ground rules.
1. This will take 3 modifiers; that can't be helped. So, there will be AA, BB and CC. AA is the left Alt, BB is the Win key, and CC is the left Ctrl.
2. There will be two versions of the keyboard. That can't be helped. One is "Primary Latin" and one is "Primary Cyrillic".
3. The unshifted ("base") keys, and keys using just Shift, are the "primary" keys. For Primary Latin, those are just the normal QWERTY keys of a U.S. keyboard. For Primary Cyrillic, they are a set of keys I decided upon that look "sort of" QWERTY-like; on the "Q" row they are Ю Ш Е Я Т У Ц І О Р.
4. For the non-primary ("other") alphabet, they will be on the CL and CU keys. So, for Primary Latin, the secondary Cyrillic "base" keys are on CL and CU, and vice versa.
5. Because Cyrillic *needs* more letters (as compared to Latin, for which it's merely 'handy' to have more), the contents of the keys on AA and BB won't be identical in the two versions. Otherwise, I'd have to give up some "live" keys I especially need for Cyrillic in order to "cram" the two versions together into one, and that's just too high a price to pay.
6. The result of all this is that when you are typing the "other" alphabet, you get that alphabet's "main" keys as Live Keys, but anything else has to be accessed via Dead Keys. That's the breaks; I can't do it any other way.
7. There will be one set of Dead Key tables, and they will have entries for both alphabets. So, the Acute Dead Key will use a single Acute table that has both Latin ǵ (g with Acute) and Cyrillic ѓ (Ge with Acute).
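Rule 7 can be pictured with a short Python sketch. This is only an illustration of the idea of one merged table; the names `ACUTE` and `apply_dead_key` are hypothetical, and the real tables live in KbdEdit.

```python
# One merged Acute table with entries for both alphabets (sample entries only).
ACUTE = {
    "g": "\u01f5",       # Latin g    -> ǵ
    "\u0433": "\u0453",  # Cyrillic г -> ѓ
    "c": "\u0107",       # Latin c    -> ć
}

def apply_dead_key(table, base):
    """Return the composed letter, or None if the table has no entry for base."""
    return table.get(base)

print(apply_dead_key(ACUTE, "g"))        # Latin result
print(apply_dead_key(ACUTE, "\u0433"))   # Cyrillic result
```

Because the base letter itself says which alphabet is meant, no mode switch is needed to choose between the Latin and Cyrillic results.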
What are the advantages of this approach?
1. No "mode shift" required. The key sequence to do a mode shift is somewhat clumsy, and is surprisingly slow. I have observed it taking about a whole second, perhaps a tiny bit more, to register the change. And if you do it too fast, the keyboard seems to lock up at times. So, the whole idea of mode shifting was quickly losing its appeal anyway.
2. Any symbols on the Secondary alphabet's dead keys should be available on the keyboard in use. So, for instance, Latin T with Dot Above will be available from the Primary Cyrillic keyboard.
3. Previously, I was adding "convenience Latin" letters to the Cyrillic keyboard, such as Latin C with Acute, since a few Cyrillic languages used it. What people *probably* did was add the Acute combining modifier to a Cyrillic C. But, being able to use a precomposed letter (even if it IS in the "wrong" alphabet) is better than adding modifiers, which often don't look that good.
What are the disadvantages of this approach?
1. Having 3 modifiers (again) puts us back in the "complex design" camp. I wanted to avoid that, but here I am again. But, the reality is, users of Cyrillic must often "live in two worlds". What's more, Cyrillic typists have had to deal with keyboard difficulties for a long, long time. What I am proposing is no worse than what they have had to cope with, and besides, it provides a solution they simply don't have: the ability to type ANY Cyrillic language on one keyboard, and ALSO type Latin with a full set of accented letters at their disposal. Their only requirement is deciding whether they want to type Cyrillic more, or Latin more. Either way, I have it covered.
2. I can't make the AA and BB sets of keys identical between the two versions. I don't want to, because each version is "optimized" for the most likely use cases. If they really need what the "other" keyboard does better, they can just go to Windows and have it swap to the "other" version.
3. There is the matter of Greek to deal with. I have to pick a dead key for it, and decide how letters are accessed. Before, to get a Greek Δ I would type the Greek dead key [ and then capital Latin D. I could either (a) require Greek to be selected with Latin letters, or (b) use what letters are native to the given version, or (c) duplicate everything so I'd only need one table for both. Besides using up precious memory, option (c) is probably not necessary. I am guessing I need option (b). So, the Greek dead key table would exist in two forms, one where Latin letters selected the Greek letters, and one where Cyrillic selected Greek.
I am sure there will be more to it than that, but I am seeing light at the end of the tunnel.
|
|
|
Post by Admin on May 7, 2020 11:53:09 GMT -5
I am making progress ...
The new keyboards will have 10 levels. These are Base, Shift, AL, AU, BL, BU, CL, CU, AB and BC. Unlike the Q Keyboard, the new ones will not have AC or the hyper keys ABC and BCU. So, 10 levels instead of 13.
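Here is a small Python sketch of how the level names could be read. It assumes that A, B and C stand for the AA, BB and CC modifiers from the ground rules and that a trailing "U" means Shift is also held; the thread does not spell that out, so treat `describe_level` as a guess at the naming scheme, not a specification.

```python
# Modifier assignments from the ground rules: AA = left Alt, BB = Win, CC = left Ctrl.
MODIFIERS = {"A": "Left Alt", "B": "Win", "C": "Left Ctrl"}

NEW_LEVELS = ["Base", "Shift", "AL", "AU", "BL", "BU", "CL", "CU", "AB", "BC"]
DROPPED = ["AC", "ABC", "BCU"]  # levels the Q Keyboard has that the new ones drop

def describe_level(level):
    """List the keys held for a level name (assumes trailing 'U' means Shift)."""
    if level == "Base":
        return []
    if level == "Shift":
        return ["Shift"]
    keys = [MODIFIERS[c] for c in level if c in MODIFIERS]
    if level.endswith("U"):
        keys.append("Shift")
    return keys

for name in NEW_LEVELS:
    print(name, describe_level(name))
print(len(NEW_LEVELS), "levels now,", len(NEW_LEVELS) + len(DROPPED), "on the Q Keyboard")
```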
In order to try to have as much commonality as possible, I have divided up the keyboard's key assignments into what I call "zones". These are as follows:
1. The Host zone, consisting of all letter keys A-Z and digit keys 0-9, when used as Base or Shift keys.
2. The Guest zone, consisting of all letter keys A-Z and digit keys 0-9, when accessed with CL/CU.
3. The National zone, consisting of all letter keys A-Z but NOT the digit keys, when accessed with BL/BU and AL/AU.
4. The Common zone, consisting of all other key and modifier combinations not mentioned above. The Common zone is the same between the Latin and Cyrillic versions.
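The four zone definitions above can be written as one small lookup. This is a Python sketch for illustration only; the function name `zone` and the way keys and levels are passed in are my own choices here, not anything from KbdEdit.

```python
import string

LETTERS = set(string.ascii_uppercase)  # letter keys A-Z
DIGITS = set(string.digits)            # digit keys 0-9

def zone(key, level):
    """Classify a (physical key, level) pair into one of the four zones."""
    is_letter = key in LETTERS
    is_digit = key in DIGITS
    if (is_letter or is_digit) and level in ("Base", "Shift"):
        return "Host"
    if (is_letter or is_digit) and level in ("CL", "CU"):
        return "Guest"
    if is_letter and level in ("AL", "AU", "BL", "BU"):
        return "National"
    return "Common"  # everything else, identical on both versions

print(zone("D", "Base"), zone("D", "CL"), zone("B", "AL"), zone("5", "AL"))
```

Note that a digit key on AL/AU or BL/BU falls through to the Common zone, since the National zone covers letter keys only.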
The following steps are taken to help remember how to use the two versions:
1. The Host and Guest keys are the same letters on both versions, simply with their modifiers exchanged. So, for the Latin version, unshifted D is "d" and CL D is "д". For the Cyrillic version, the two letters are exchanged: unshifted D is "д" and CL D is "d".
2. In the National zone are letters "important" to that version. So, for the Latin version, there are "extra" keys like æ and å and ø while the Cyrillic one has variations of Cyrillic besides letters seen in Russian. It's not an "exact science" deciding what particular letters get placed in the National zone. I depend on research to try to determine what letters are used most and what would be typically needed in many languages.
So, the Latin alphabet will have, for example, Ł with slash stroke, but the Cyrillic one will not, because users of Cyrillic alphabets just don't need Ł with slash stroke. If on rare occasion they DID need one, they can use the / dead key and the L from the Guest zone to easily produce one.
3. The Common zone has punctuation, popular currency signs, math symbols and other characters that generally do not belong to any particular country, language or alphabet.
4. For Greek, I am going to simplify this so it works the same on both versions. For Latin and Cyrillic, it's necessary to retain separate sections in the Dead Key tables, since there could be common letter forms. For example, there is a Latin ö (U+00F6) and a look-alike Cyrillic ӧ (U+04E7), so it's necessary to be able to type both of them, because they have different Unicode values. For Greek, that's not the case. Greek Δ is Greek Δ. We only need one "copy" of them. So, for the Latin keyboard, the Greek dead key table will respond to the native Latin keys, and for Cyrillic, it will respond to native Cyrillic. From a typist's point of view, you will use the same keystrokes and get the same results.
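The Greek plan in item 4 might be sketched like this in Python. The names `GREEK_BY_LATIN`, `CYRILLIC_HOST` and `greek_table` are hypothetical, and only the D key is shown (using the д-on-D example from item 1); the real tables would cover the whole alphabet.

```python
# Greek letters, authored once against the physical (Latin-labelled) keys.
GREEK_BY_LATIN = {"D": "\u0394", "d": "\u03b4"}  # Δ, δ

# Host letter the same physical key produces on the Cyrillic version (sample only).
CYRILLIC_HOST = {"D": "\u0414", "d": "\u0434"}   # Д, д

def greek_table(version):
    """Build the Greek dead key table keyed by that version's native letters."""
    if version == "latin":
        return dict(GREEK_BY_LATIN)
    return {CYRILLIC_HOST[k]: v for k, v in GREEK_BY_LATIN.items()}

# Same physical keystrokes, same Greek result on either version.
print(greek_table("latin")["D"], greek_table("cyrillic")["\u0414"])
```

Each version carries only its own copy of the table, so the Greek letters are not stored twice on one keyboard.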
Because Cyrillic needs more letters, the numeric row is taken up by letters. This has the following consequences:
1. In the Cyrillic version, the number keys 2 through 0 (going left to right) will respect Caps Lock, while key 1 will not. That is because key 1 will contain the Cyrillic PALOCHKA. The common practice with this letter is to only use the "capital" form, and we don't want that changing when Caps Lock is on. A formal "Small" PALOCHKA exists on the shifted key for special cases, but most Cyrillic users will rarely if ever need it. BTW, historically, the PALOCHKA was on the 1 key on various Cyrillic typewriters, so for experienced Cyrillic typists, this should feel right at home.
2. In the Latin version, Caps Lock will work the same as standard QWERTY.
3. For Cyrillic users, digits and the punctuation on the digit keys are reached through the Guest keys on CL and CU.
One thing you will notice in the Common zone is that all punctuation, other than what's on the digit keys, is the same on both versions. I had been exploring the use of [ ] and \ for things like Soft Sign and Yeru, but I felt it was important that punctuation be standard and readily available, something not always true in native Cyrillic keyboard layouts. So, keys like ь ъ ы ӹ are on digit keys starting with 6 and rightward. That decision also means that the Cyrillic letter б Б which corresponds to English "B" doesn't have a key of its own. Instead, it is typed as AL B. The reason for doing this is that key 6 (where it used to be) is just too far of a "reach" to type all the time, whereas keys like ь ъ ы ӹ are seldom used and having them on the number row should not pose a typing hardship.
As always, all of this is just "theory" and will have to be tested to see if it really works out.
|
|