@Azure that sounds like a valid concern, but I guess it's too late now, as they may have already closed off the path to an incremental, backwards-compatible transition.
@lyrabon I remember reading about diacritics on Latin letters: "á" as a single precomposed character and "<a><combining_acute>" (a plain "a" followed by a combining acute accent) are two different binary representations of the same text. Needless to say, that is bad for interoperability. Maybe UTF-8 retaining ASCII backwards compatibility was a mistake.
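
For what it's worth, here is a quick Python sketch of the two representations, using only the standard `unicodedata` module (the variable names are just illustrative). It also shows the usual workaround as far as I understand it: normalize both strings to the same form (NFC or NFD) before comparing them.

```python
import unicodedata

# "á" as one precomposed code point: U+00E1
precomposed = "\u00e1"
# "á" as "a" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "a\u0301"

# They render the same but are different code point sequences,
# so a naive comparison fails and the UTF-8 bytes differ.
print(precomposed == decomposed)      # False
print(precomposed.encode("utf-8"))    # b'\xc3\xa1'
print(decomposed.encode("utf-8"))     # b'a\xcc\x81'

# Normalizing to a common form makes them comparable.
nfc = unicodedata.normalize("NFC", decomposed)   # composes to U+00E1
nfd = unicodedata.normalize("NFD", precomposed)  # decomposes to a + U+0301
print(nfc == precomposed)             # True
print(nfd == decomposed)              # True
```

So the data is recoverable, but only if every system in the chain agrees to normalize, which is sort of the interoperability problem in a nutshell.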