|Created by:||Gary Shannon, et al.||2006|
|Setting and usage:||Testing principles of collaborative corpus driven language|
|Total speakers:||Several with a recorded level of proficiency|
|Category (purpose):||Constructed language with elements of the subgenres artistic language and personal language|
|Category (sources):||An a posteriori language, with elements of Tazhu, Madjal, Swahili, Tok Pisin, and Indo-European languages|
|Regulated by:||Community moderation|
Kalusa is a collaborative, corpus-driven, unplanned constructed language. The project was created by retired video game programmer Gary Shannon, and launched online in May 2006. It is most notable for its distributed, anonymous contribution process.
The definitive source of the language is a corpus of Kalusa-to-English translations. The only way to contribute to the language was to add to this corpus anonymously, through a web interface hosted by Shannon. Due to social tensions the project was abandoned by September 2006, but not before it had produced over 4,200 corpus sentences, 1,300 contributor comments, and 200 mailing list messages.
The project started with four seed sentences, posted by Shannon, with accompanying English translations. Collaborators were then encouraged to add their own sentences and translations, based on the existing material. All sentences were submitted anonymously, and could be anonymously rated up or down by the community.
The rating was quantified as a Correctness Quotient (CQ). The CQ could range between zero and 200, with a score of 100 denoting an average correctness. Sentences above that score were considered to be good examples of Kalusa by the community, and conversely those below were considered bad examples. Sentences that dropped below a certain level for a certain period of time were deleted entirely from the corpus.
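The deletion rule described above can be illustrated with a minimal sketch. The actual threshold, grace period, and CQ formula used by the Kalusa software are not recorded here, so `CQ_FLOOR`, `GRACE`, `Entry`, `update_cq`, and `prune` are all hypothetical names and values chosen only for illustration; the only details taken from the description are the 0 to 200 range and the average score of 100.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

CQ_FLOOR = 50.0             # hypothetical: the real cut-off level is not documented
GRACE = timedelta(days=3)   # hypothetical: the real waiting period is not documented


@dataclass
class Entry:
    kalusa: str
    english: str
    cq: float = 100.0                       # 100 denotes average correctness
    below_since: Optional[datetime] = None  # when the CQ first fell below the floor


def update_cq(entry: Entry, new_cq: float, now: datetime) -> None:
    """Store a new CQ, clamped to the 0-200 range, and track time spent below the floor."""
    entry.cq = max(0.0, min(200.0, new_cq))
    if entry.cq < CQ_FLOOR:
        if entry.below_since is None:
            entry.below_since = now
    else:
        entry.below_since = None


def prune(corpus: List[Entry], now: datetime) -> List[Entry]:
    """Keep entries that are above the floor or have not yet been below it for GRACE."""
    return [e for e in corpus
            if e.below_since is None or now - e.below_since < GRACE]
```

Under these assumed values, a sentence whose CQ falls to 40 would survive a prune one day later but be dropped four days later, unless its CQ had recovered in the meantime.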
As the language grew, its primary source always remained the corpus. Several reference works were, however, also created, chiefly to document the vocabulary of the language. There were also out-of-band discussions concerning the language's features. These discussions precipitated an impromptu Kalusa community.
History and community
In June 2004, Shannon had created a corpus-driven language called Madjal. The collaborators on that project were Andrew W. Soukup, Roger Mills, David Peterson, Sally Caves, and a further pseudonymous contributor. The similarity of Kalusa to Madjal is evident from a section describing the latter's rules: "The only rule of Madjal is that there are no written grammatical rules. All that is known about Madjal grammar and vocabulary is found in the corpus which includes almost everything that has ever been written in the language."
Madjal is cited as a direct influence on Kalusa by David Peterson. On 22 May 2006, Shannon announced the new Kalusa project to the CONLANG-L mailing list. In the announcement he says that some years ago on the same list he "tried to start up a collaborative conlang project that turned out to be impractical", but that now he has "found a way to make it work". By 28 May 2006, the new language was growing at a rate of 110 new words per day.
Though submissions to the corpus were anonymous, the web interface also housed a comments system. Usernames were required to comment, but there was no login system, and pseudonyms were used almost exclusively. The discussions in the comments system were the primary means of communicating about the language, and more was written about the language there than in any other forum. As the initial announcement of the project was on a constructed language mailing list, some community discussion inevitably arose there too. Once the comments had proved so popular a feature, a dedicated Kalusa mailing list was set up. It did not achieve the popularity of the comments system, but still proved a lively forum for more detailed interchange and debate.
Writing system
According to community analysis, the approximate order of frequency of letters used in Kalusa grouped into frequency tiers is: a, i e, k o s u r, m t n z, d l y, p v h g b, q f, w j. The top twenty digraphs are ki, es, ia, sa, ya, za, ay, ka, is, ku, zi, ha, ko, ze, au, go, ok, se, az, iz.
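A tally of this kind can be reproduced with a few lines of Python. The following is only a sketch of one plausible method, not the community's actual procedure: the function name `letter_and_digraph_counts` is made up for illustration, and it assumes that digraphs are counted within words only, never across spaces or punctuation.

```python
from collections import Counter


def letter_and_digraph_counts(sentences):
    """Count single letters and adjacent letter pairs inside each word."""
    letters = Counter()
    digraphs = Counter()
    for sentence in sentences:
        # Lowercase, and turn every non-letter into a word separator.
        cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in sentence)
        for word in cleaned.split():
            letters.update(word)
            digraphs.update(word[i:i + 2] for i in range(len(word) - 1))
    return letters, digraphs


# Three of the earliest corpus sentences, used here as a tiny sample.
sample = [
    "Ira vito es palu.",
    "Ira vito es teku kia ruba.",
    "Ma vito es da ruba.",
]
letters, digraphs = letter_and_digraph_counts(sample)
print(letters.most_common(5))
print(digraphs.most_common(5))
```

Run over the full corpus rather than this three-sentence sample, the two counters would give rankings comparable to the tiers and digraph list above.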
Pronunciation and phonotactics
As a primarily corpus-driven language, Kalusa did not originate as a spoken language, and therefore has no definitive pronunciation. There are few community comments on pronunciation; an early one proposed that letters should have their IPA values, with separate values assigned to y, sh, zh, ng and (maybe more contentiously) c and q. Perhaps facetiously, there was also a suggestion that it would be good if both tt and th were given the same pronunciation.
Though there are few comments on the pronunciation and phonology of the language, there are several comments on phonotactics. Since new words were created based on corpus samples, it was important for contributors to have at least an intuitive grasp of the phonotactics of existing words in order for new words to appear consistent. Some contributors went to some lengths to get a more than intuitive grasp, as in the following comment by Jim Henry:
There are many long-established words that begin with consonant + r [...]. There are also many long-established words that have two final vowels. E.g., "krevo", "kia", "trosu", "kua". I agree that words with two final consonants, or two initial consonants where the second is not |r|, |y| ("nyava", "pyanezres"), or |w| ("kwa"), violate Kalusa phonotactics. I would prefer to have fewer words ending in a consonant [...] but words ending in consonants have been there from the beginning, such as "es".
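The two constraints stated in this comment (no two final consonants, and no two initial consonants unless the second is r, y, or w) can be turned into a rough checker. This is a sketch under stated assumptions rather than an agreed community rule set: the names `segments` and `violations` are invented here, y and w are always treated as consonants, and sh, zh, and ng are assumed to count as single consonants, which the comment itself does not say.

```python
VOWELS = set("aeiou")
# Assumption: these letter pairs each spell one consonant, so "ziresh" and
# "mung" end in a single consonant rather than a cluster.
DIGRAPHS = ("sh", "zh", "ng")
ALLOWED_SECOND = {"r", "y", "w"}


def segments(word):
    """Split a word into units, keeping sh, zh, and ng together."""
    word = word.lower()
    units = []
    i = 0
    while i < len(word):
        if word[i:i + 2] in DIGRAPHS:
            units.append(word[i:i + 2])
            i += 2
        else:
            units.append(word[i])
            i += 1
    return units


def violations(word):
    """Return the cluster constraints that a single word breaks."""
    units = segments(word)
    problems = []
    if len(units) >= 2:
        first, second = units[0], units[1]
        if first not in VOWELS and second not in VOWELS and second not in ALLOWED_SECOND:
            problems.append("initial cluster")
        if units[-1] not in VOWELS and units[-2] not in VOWELS:
            problems.append("final cluster")
    return problems


for w in ["krevo", "nyava", "kwa", "es", "ziresh", "stalk"]:
    print(w, violations(w))
```

The attested words from the comment and the corpus pass, while an invented form such as "stalk" is flagged for both its initial and its final cluster. Word-internal clusters are deliberately left unchecked, since the comment does not address them.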
Grammar and morphology
Adverbs can be created by adding an -at or -rat suffix to an adjective.
As of 16 June 2006 there were 464 words in Kalusa. In Shannon's Dictionary of Modern Kalusa, which lists only 181 Kalusa words, there are 77 nouns, 40 verbs, 26 adjectives, ten adverbs, eight pronouns, six conjunctions, five prepositions, five particles, three names, one suffix, one honorific, and one colloquialism. The dictionary came under criticism, however, for being too "definite" in its interpretations, and "perhaps premature". It was apparently abandoned early on, possibly due to this criticism.
Words occurring over 1% of the time in the corpus as analysed by Jim Henry were in decreasing order of frequency: ma, es, kia, da, dun, za, ira, lok, goro, kome, taya, sam, pe, bogi, kisa, and ib. Their meanings are given here as stated in the Dictionary of Modern Kalusa; another drawback of this dictionary can be seen in the fact that of these sixteen most frequent words, five are not described in the dictionary.
ma — first person singular (pron.)
es — accusative case marker (part.)
kia — of or belonging to, used to join nouns to adjectives (part.)
da — noun conversion (part.)
dun — past tense (part.)
za — no definition
ira — third person singular (pron.)
lok — be at or located at (noun)
goro — the copula be, is, am, or are (verb)
kome — to eat (verb)
taya — no definition
sam — no definition
pe — no definition
bogi — to have (verb)
kisa — no definition
ib — and (conj.)
Some of the initial words, such as elamu (apple) and palu (cat), were taken or derived from Shannon's 2004 language Tazhu; they appear there as elamu and peru respectively. Others were, according to Shannon, derived in pieces from Swahili. Another collaborator cites Tok Pisin as the source of at least one attempted feature.
The lack of a standard phonology was identified as a source of problems. "Not being a spoken language, apparently little attention is being paid to the sound of the language, and words and sentences that are either unpleasant tongue twisters, or frankly childish sing-song constructions have found their way into the language."
The voting system used to set CQ values was initially an open vote: any person could vote any number of times. This resulted in a large attack on the corpus on 13 June, referred to as "the massacre", which caused many sentences to be deleted. The loophole was subsequently closed when Shannon "later modified the Kalusa software to disallow multiple votes on the same sentence from the same IP address." The "massacre" may have prompted the jocular message from Shannon on 16 June describing how a devastating eruption on the Island of Kalu had destroyed much of the Great Library.
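The fix quoted above amounts to remembering which IP addresses have already voted on which sentence. A minimal sketch of that idea follows; the class name `SentenceVotes`, its methods, and the plus or minus one vote values are assumptions made for illustration, and the sketch keeps only a net tally because the way the real software converted votes into a CQ is not recorded here.

```python
class SentenceVotes:
    """Accept at most one vote per (sentence, IP address) pair."""

    def __init__(self):
        self._seen = set()   # (sentence_id, ip) pairs that have already voted
        self.tally = {}      # sentence_id -> net votes (up minus down)

    def vote(self, sentence_id, ip, delta):
        """Record an up (+1) or down (-1) vote; return False if it is a repeat."""
        if delta not in (1, -1):
            raise ValueError("delta must be +1 or -1")
        key = (sentence_id, ip)
        if key in self._seen:
            return False     # same IP, same sentence: ignore the vote
        self._seen.add(key)
        self.tally[sentence_id] = self.tally.get(sentence_id, 0) + delta
        return True


votes = SentenceVotes()
print(votes.vote(5, "203.0.113.7", -1))   # first vote from this address counts
print(votes.vote(5, "203.0.113.7", -1))   # repeat vote on the same sentence is rejected
print(votes.vote(6, "203.0.113.7", -1))   # the same address may still vote on another sentence
```

A check of this kind stops one visitor from repeatedly downvoting a sentence, though it does nothing against a voter who can change IP address.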
Given the intense activity at the beginning of the project, it is not surprising to also find many positive comments about the language. Speaking of collaborative constructed languages, Jim Henry said that Kalusa "was by far the most interesting of the ones I've been involved in". David Peterson described Kalusa as "one of the most interesting collaborative conlangs I've ever seen".
Representative example sentences with a high CQ include:
- Elamu kisa dun kome au miqi teset. (Sentence 536)
- This apple was eaten by a mouse.
- Ma ziresh es awan kia elamu. (Sentence 564)
- I want a slice of apple.
- Ma dun gada lok mung kia kauno. (Sentence 587)
- I trod in dog feces.
- Za da leota-ruba. Za goro orgon. (Sentence 610)
- You are orange. You are an orange.
- Ma dun goro zhati kia biti. Za dun kome es zhati kia biti. (Sentence 685)
- I was a small bird. You ate a small bird.
- Teya trin teset vige. (Sentence 724)
- Tea can be drunk.
- Zhati biti terehe poi sepahuwe, ira dun kevuzi. (Sentence 758)
- The little bird tried to fly over the rainbow.
Earliest corpus sentences
The first five retained sentences in the early 29 May corpus, with Correctness Quotients, are:
|Sentence||Kalusa||English||CQ|
|1||Ma vito es John||I see John||105.56|
|2||Ira vito es palu.||He sees the cat.||133.33|
|3||Ira vito es teku kia ruba.||He sees the red book.||112.5|
|4||Ma vito es da ruba.||I see the red one.||110|
|6||Ma dun vito es da kisa.||I saw this one.||112.5|
Sentence 5 was deleted due to downvoting.
The Saga of Malia or the Saga of Malia and Kuana, a surreal modernesque folk story about a milkmaid and her calf, is cited as the first and perhaps only example of Kalusa literature. Extended uses of Kalusa outside of the corpus were rare, though the occasional use of sentences or phrases such as "Ka Kalusa da vezya!", the imperative of "Kalusa is strong!", was comparatively common.