LogicDeLuxe: I tried your method (inserting the extra speech) but with the Cobb dialogue I tried, the two sections end up overlapping, because the subtitles move on before Cobb is done with the first bit.
I don't know anything about the lip sync entries. I don't think scummspeaks generates any meaningful data for them. Maybe tweaking those could solve this issue.

Combining samples is an option, right, at the cost of slightly off subtitle sync. Another side effect would be the need for some duplication. There are some multi-sample lines which have only some of their samples in common with other lines. Those samples need to be duplicated with this method. There aren't too many such lines, though.
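To get a feel for how many lines would be affected, something like the following rough Python sketch could help. It assumes you already have a table mapping each spoken line to the samples it plays, in order; the line and sample names here are made up for illustration, and none of this reads actual game data. It flags every multi-sample line that shares a sample with a line using a different sample sequence, since those are the ones where merging forces duplicated audio.

```python
from collections import defaultdict

def lines_needing_duplication(lines):
    """Return ids of multi-sample lines that share samples with differing lines.

    `lines` maps a (hypothetical) line id to its ordered list of sample ids.
    """
    # Record every distinct sample sequence each sample appears in.
    sequences_by_sample = defaultdict(set)
    for samples in lines.values():
        for sample in samples:
            sequences_by_sample[sample].add(tuple(samples))

    affected = set()
    for line_id, samples in lines.items():
        if len(samples) < 2:
            continue  # single-sample lines are not merged at all
        # A sample used in more than one distinct sequence must be copied
        # into each merged sample that contains it.
        if any(len(sequences_by_sample[s]) > 1 for s in samples):
            affected.add(line_id)
    return affected

# Made-up example data, not taken from the game.
example = {
    "line_a": ["s1", "s2"],
    "line_b": ["s2", "s3"],
    "line_c": ["s4", "s5"],
    "line_d": ["s4", "s5"],
    "line_e": ["s6"],
}
print(sorted(lines_needing_duplication(example)))
```

Lines with an identical sample sequence (line_c and line_d above) can simply share one merged sample, so they are not flagged.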

The best thing would be a method which could automatically decompile the complete game, patch the scripts to split those multi-sample lines into individual speak events, and recompile the entire game.

Too bad there is no convenient tool like Deutex for Doom, which not only extracts all resources into individual files (like scummtr does), but also converts them to usable formats and back, so that a wide range of existing software can be used to modify them.
