Originally Posted by jott
My (unconfirmed) guess is that the first int fields in speech.info are some sort of CRC/hash used as a lookup for the original strings.
From what I could gather, the header before each "group of cues" entry in the speech.info file has the following fields.
- 4 bytes that are some kind of CRC or other value uniquely indexing a spoken line (in English). The cues "Guybrush Threepwood.", "Guybrush THREEPWOOD", "Guybrush Threepwood!" all have the same value for this field, even though they are spoken by different characters and differ in pronunciation. I really have trouble understanding the reason for this field, but I bet that this is what causes problems with fansubs.
- 2 bytes that index the room where the cue is spoken
- 2 bytes that are some kind of index for the "interaction" where the cue is spoken (e.g. same value if Guybrush interacts with the same object, same value throughout one conversation)
- 2 bytes that somehow index the cue within the conversation or the room (I am not sure which).
- 2 bytes that incrementally index a "subcue" (if any) within a single cue. Subcues are separated by the escape sequence \255\003 in the original text. (value 00 for the first subcue, 01 for the second etc.)
- 2 bytes that have the value 01 if the line is spoken by Guybrush, and 00 otherwise.
- 2 bytes of unknown purpose. They almost always seem to be 00, but there are cases where they have other values.
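
In case it helps anyone poke at the file, here is a rough Python sketch of reading one such header, based only on the field list above (4 + 6x2 = 16 bytes). Big caveats: the byte order is my assumption (little-endian), the field names are made up for illustration, and I am reading \255\003 as the decimal byte values 255 and 3. None of this is confirmed.

```python
import struct
from collections import namedtuple

# ASSUMPTION: little-endian byte order; the layout notes above do not say.
# One 4-byte unsigned int followed by six 2-byte unsigned ints = 16 bytes.
HEADER_FORMAT = "<I6H"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Field names are hypothetical labels for the fields listed above.
CueHeader = namedtuple(
    "CueHeader",
    "text_hash room interaction cue subcue is_guybrush unknown",
)

# ASSUMPTION: \255\003 means the two bytes with decimal values 255 and 3.
SUBCUE_SEPARATOR = bytes([255, 3])


def parse_cue_header(data, offset=0):
    """Unpack one 16-byte header starting at the given offset."""
    return CueHeader(*struct.unpack_from(HEADER_FORMAT, data, offset))


def split_subcues(text):
    """Split the raw bytes of a cue into its subcues."""
    return text.split(SUBCUE_SEPARATOR)
```

Cues that share the first field (like the three "Guybrush Threepwood" variants) would come back with the same text_hash, so grouping parsed headers by that field would be one way to check the guess.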