The current version defines two types of documents: the global elements below...
The global types are available for reuse through schema type extension or restriction.
The most up-to-date document definition is CHAT, which is also the richest in structure. Ideally, each group should develop a schema module defining the structure of its specific (class of) annotations; this schema should then be an assembly of those definitions.
Developed by Romeo Anghelache, from the CHAT specifications,
released under the GNU Public License, 2001
structure of a CHAT document
@Participants; a structure enumerating the beings participating
a group of utterances having something in common, usually the speaker
31 March 1999 is formatted as 1999-03-31
this work might be done over an extended interval of time; to indicate a duration of 1 year, 2 months, 3 days, 10 hours, and 30 minutes, one would write: P1Y2M3DT10H30M
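The date and duration notations above can be illustrated with a minimal Python sketch. The function name parse_duration and the returned dictionary keys are hypothetical, chosen only for this example, and the pattern accepts whole-number components only; it is not a full validator for the schema's date or duration types.

```python
import re
from datetime import date

# Lexical form PnYnMnDTnHnMnS, restricted to whole numbers in this sketch.
_DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Return the duration components as a dict; absent components are 0."""
    match = _DURATION.match(text)
    # Reject strings with no component at all ("P") or a dangling "T".
    if match is None or text.endswith("T") or not any(match.groupdict().values()):
        raise ValueError(f"not a recognised duration: {text!r}")
    return {name: int(value or 0) for name, value in match.groupdict().items()}


# 31 March 1999 in the year-month-day form used above.
formatted_date = date(1999, 3, 31).isoformat()

# The duration example from the text.
components = parse_duration("P1Y2M3DT10H30M")
```

Here formatted_date holds the string 1999-03-31, and components maps years, months, days, hours and minutes to 1, 2, 3, 10 and 30, with seconds set to 0.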
e.g. when you write Russian words using English characters, then Lang="ru" and Script is "en"
an AIF document, see http://morph.ldc.upenn.edu/AG/doc/xml/
administrative descriptions, reused from Dublin Core
allows semi structured extensions to the current set of annotations:
these are the (legacy) dependent tiers; the %mor line is now the <morphemics> element
%add
%act
%alt
%cod; general purpose coding
%coh; cohesion tier
%com;[% text]; comments by investigator
[0 text]; an omitted word
%eng
%err; error coding
[%exc ...]
%exp; [= text]
%flo
%fac
%gls
%gpx
%int
%lan
%par:
%pho:
%mod:
[: text]
%def; on the main line, not recommended
%sit
%spa
%tim
arbitrary annotations, CHAT postcodes, intended as an extension mechanism
inlined annotations, the conventional CHAT symbols are listed too
[!]
[!!]
["]
[?] in CHAT, ( text ) in CA
[*]
[/] in CHAT
[//] in CHAT, - in CA
[///] in CHAT
[/?]
[/-]
quicker tempo, no CHAT equivalent, used in CA
slower tempo, no CHAT equivalent, used in CA
larger volume, louder, no CHAT equivalent, used in CA
lower volume, no CHAT equivalent, used in CA
[>]
[<]
non verbal happenings
scoped symbols
the place to add research content
xxx;yyy
www
0
hhh in CA
.hhh in CA
a noise not yet clearly categorized
clearing throat noise
smacking lips noise
intended as a feature of a word, see also the CHAT conventional notations
@ap
@b
@c
@cue
@d
@f
@fp
@g
@i
@inf
@ins
@l
@n
@nv
@o
@p
@pr
@s
@sc
@sas
@sl
@t
@u
@x
@wp
xx/yy
()
0word
0*word
00word
&; phonological fragment
a nonempty string
syntactic structure
the whole morpheme is actually a prefix, CHAT equivalent is ~#
category
subcategory
the unit of a %mor line corresponding to a word (this element belongs to a word element but, if the precise correspondence is not yet established, these elements will be present at the utterance level, contained in an utterance)
a word
clitic or compound markers, may be used in morphemics and wordnet
a word
equivalent of CHAT symbol @;
the place to add research content
scoped symbols
one may attach a translation of a word;
an optional suffix
-?
-.
-!
-'.
-,.
-,
-_
-'
-
,,
-:
structure used to let annotations belong to more than one word; it can be recursive, although this is unnecessary: one can attach more than one annotation to a word, a group of words, or a whole utterance
a word
a construct formed by words linked through clitic or compound markers, e.g. once+and+for+all
a reference to a point/portion of a mute/action signal, e.g. 0
semicolon, clause delimiter [c];
scoped symbols
the place to add research content
one may attach a translation of a word;
stress, blocking etc.
equivalent of CHAT symbol @;
the place to add research content
scoped symbols
one may attach a translation of a word;
an optional suffix
utterance initiators or linkers; they indicate the way to fit the current utterance with an earlier one, the CHAT conventional symbols are listed too
+"
+^
+<
+,
++
a pointer to a selection in a video/audio file
frame
second
byte
character
morphemes
prefix marker, CHAT equivalent is #
suffix marker, CHAT equivalent is -
suffix fusion marker, CHAT equivalent is &;
omitted affix, CHAT equivalent is -0
incorrectly omitted affix, CHAT equivalent is -0*
english translation, CHAT equivalent is =
morphological category, CHAT equivalent is :, when used after the stem
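As an illustration of how the morpheme markers listed above might be recognised in a legacy %mor-style string, here is a minimal Python sketch. The label names, the function tokenize_morpheme and the sample strings are hypothetical; only the marker symbols come from the list. The sketch separates markers from text and does not attempt to interpret them. In particular, it always reads -0 as the omitted-affix marker, which is an assumption.

```python
import re

# Marker symbols from the list above; the label names are hypothetical.
MARKERS = {
    "-0*": "incorrectly_omitted_affix",
    "-0": "omitted_affix",
    "#": "prefix",
    "-": "suffix",
    "&": "suffix_fusion",
    "=": "english_translation",
    ":": "category",
}

# Try longer symbols first so "-0*" is not read as "-" followed by text.
_MARKER = re.compile(
    "|".join(re.escape(symbol) for symbol in sorted(MARKERS, key=len, reverse=True))
)


def tokenize_morpheme(unit):
    """Split a string into (label, text) pairs; plain text is labelled 'text'."""
    tokens = []
    position = 0
    for match in _MARKER.finditer(unit):
        if match.start() > position:
            tokens.append(("text", unit[position:match.start()]))
        tokens.append((MARKERS[match.group()], match.group()))
        position = match.end()
    if position < len(unit):
        tokens.append(("text", unit[position:]))
    return tokens


# Hypothetical sample: a prefix, a stem and a suffix.
sample_tokens = tokenize_morpheme("un#do-ing")
```

A consumer of such tokens would still need to decide which neighbouring text each marker applies to, since the prefix marker follows its prefix while the suffix markers precede their suffix.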
the beings along with their characteristics (age, sex...)
/
//
///
:
::
^
*text* in CA
the # symbol, pause between words
the place to add research content
the place to add research content
[c] clause-delimiter;
period, question, exclamation; basic utterance terminator;
+...
+..?
+!?
+/.
+/?
+//.
+//?
+"/.
+".
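The terminators listed above share suffixes (for example, +... ends in a period), so a reader of legacy CHAT lines has to test the longer symbols first. A minimal Python sketch of that idea follows; the function name split_terminator and the sample utterances are hypothetical, and the sketch assumes the terminator is the last thing on the line.

```python
# Terminator symbols from the list above, plus the three basic ones.
TERMINATORS = [
    "+...", "+..?", "+!?", "+/.", "+/?", "+//.", "+//?", '+"/.', '+".',
    ".", "?", "!",
]

# Longest first, so "+..." is not mistaken for a plain period.
_BY_LENGTH = sorted(TERMINATORS, key=len, reverse=True)


def split_terminator(utterance):
    """Return (body, terminator); terminator is None if none is found."""
    text = utterance.rstrip()
    for symbol in _BY_LENGTH:
        if text.endswith(symbol):
            return text[: -len(symbol)].rstrip(), symbol
    return text, None


# Hypothetical sample utterances.
trailing_off = split_terminator("and then +...")
question = split_terminator("what is that ?")
```

With these samples, trailing_off is the pair ("and then", "+...") and question is ("what is that", "?").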
the place to add research content
a word
a group of words
a construct formed by words linked through clitic or compound markers, e.g. once+and+for+all
a reference to a point/portion of a mute/action signal, e.g. 0
semicolon, clause_delimiter [c];
scoped symbols
the place to add research content
compound, CHAT +
clitic, CHAT ~
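To show how a construct such as once+and+for+all could be broken into its words and linking markers, here is a minimal Python sketch. The function name split_construct and the returned structure are hypothetical; only the + and ~ symbols and their meanings come from the two entries above.

```python
import re

# CHAT linking symbols and the marker kinds they correspond to.
LINK_KINDS = {"+": "compound", "~": "clitic"}


def split_construct(construct):
    """Return (words, links) for a string of words joined by + or ~."""
    # The capturing group keeps the separators in the result list,
    # so words sit at even indices and markers at odd indices.
    pieces = re.split(r"([+~])", construct)
    words = pieces[0::2]
    links = [LINK_KINDS[symbol] for symbol in pieces[1::2]]
    return words, links


# The example construct from the text.
words, links = split_construct("once+and+for+all")
```

Here words is the list once, and, for, all, and links contains "compound" three times, one entry per + between neighbouring words.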