fix erroneous acceptance of f4 9x xx xx code sequences by utf-8 decoder
author Rich Felker <dalias@aerifal.cx>
Fri, 1 Sep 2017 21:05:40 +0000 (17:05 -0400)
committer Rich Felker <dalias@aerifal.cx>
Fri, 1 Sep 2017 21:05:40 +0000 (17:05 -0400)
the DFA table controlling accepted ranges for the f4 prefix used an
incorrect upper bound of 0xa0 where it should have been 0x90, allowing
such sequences to be accepted and decoded as non-Unicode-scalar values
0x110000 through 0x11ffff.
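
As a rough illustration of the effect described above (not code from this commit), the sketch below assumes the two arguments of R() describe a half-open range [lo, hi) for the byte following the lead byte, which is consistent with the values quoted in the message. The helpers second_byte_ok and decode4 are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the source tree: checks whether b is an
 * acceptable second byte, assuming the bounds form a half-open range
 * [lo, hi). */
static int second_byte_ok(unsigned char b, unsigned lo, unsigned hi)
{
    return b >= lo && b < hi;
}

/* Hypothetical helper: assembles the value encoded by a 4-byte sequence
 * without performing any validation, to show what an accepting decoder
 * would produce. */
static uint32_t decode4(const unsigned char s[4])
{
    return ((uint32_t)(s[0] & 0x07) << 18)   /* 3 payload bits of the lead byte */
         | ((uint32_t)(s[1] & 0x3f) << 12)   /* 6 payload bits per continuation byte */
         | ((uint32_t)(s[2] & 0x3f) << 6)
         |  (uint32_t)(s[3] & 0x3f);
}
```

Under that assumption, second_byte_ok(0x90, 0x80, 0xa0) is true with the old bound and second_byte_ok(0x90, 0x80, 0x90) is false with the new one. Passing f4 90 80 80 to decode4 gives 0x110000 and f4 9f bf bf gives 0x11ffff, while the largest sequence still accepted, f4 8f bf bf, gives 0x10ffff.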

src/multibyte/internal.c

index 7e1b1c0..2f5aaa9 100644
@@ -9,7 +9,7 @@
              | x )
 #define F(x) ( ( x>=5 ? 0 : \
                  x==0 ? R(0x90,0xc0) : \
-                 x==4 ? R(0x80,0xa0) : \
+                 x==4 ? R(0x80,0x90) : \
                  R(0x80,0xc0) ) \
              | ( R(0x80,0xc0) >> 6 ) \
              | ( R(0x80,0xc0) >> 12 ) \