qapi: Support (subset of) \u escapes in strings

The handling of \ inside QAPI strings was less than ideal, and
really only worked for JSON's \/, \\, \", and our extension of \'
(an obvious extension, when you realize we use '' instead of ""
for strings).  For other things, like '\n', it resulted in a
literal 'n' instead of a newline.
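
For illustration, the old behavior can be reproduced with a minimal
sketch.  This is not the parser itself; the name old_unescape is
hypothetical, and the loop only mimics "append whatever follows the
backslash":

```python
def old_unescape(body):
    """Mimic the pre-patch handling: after a backslash, keep the next
    character verbatim.  `body` is the text between the quotes."""
    out = ''
    esc = False
    for ch in body:
        if esc:
            out += ch          # whatever follows '\' is copied as-is
            esc = False
        elif ch == '\\':
            esc = True
        else:
            out += ch
    return out


print(repr(old_unescape(r"a\'b")))   # the escaped quote survives as a quote
print(repr(old_unescape(r"a\nb")))   # the newline escape degrades to 'n'
```

The second call yields 'anb', which is the "literal 'n'" problem
described above.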

Of course, at the moment, we really have no use for escaped
characters, as QAPI has to map to C identifiers, and we currently
support ASCII only for that.  But down the road, we may add
support for default values for string parameters to a command
or struct; if that happens, it would be nice to correctly support
all JSON escape sequences, such as \n or \uXXXX.  This gets us
closer, by supporting Unicode escapes in the ASCII range.

Since JSON does not require \OCTAL or \xXX escapes, and our QMP
implementation does not understand them either, I intentionally
reject them here, but it would be an easy addition if we desired it.
Likewise, intentionally refusing the NUL byte means we don't have
to worry about C strings being shorter than the qapi input.
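
The NUL concern can be shown with Python's ctypes module.  This is
only a demonstration of C string semantics under the assumption that
the decoded value eventually lands in a NUL-terminated buffer; it is
not code from this patch:

```python
import ctypes

# A Python string may contain NUL, but a C string ends at the first one.
data = b"ab\x00cd"
buf = ctypes.create_string_buffer(data)

print(len(data))        # length of the input as Python sees it
print(len(buf.value))   # length up to the first NUL, as C would see it
```

Here the input is 5 bytes long while the C view stops after 2, so
rejecting \u0000 avoids that mismatch entirely.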

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Eric Blake 2015-05-04 09:05:36 -06:00 committed by Markus Armbruster
parent 363b4262a1
commit a7f5966b29
26 changed files with 66 additions and 4 deletions

View File

@@ -173,7 +173,41 @@ class QAPISchema:
                         raise QAPISchemaError(self,
                                               'Missing terminating "\'"')
                     if esc:
-                        string += ch
+                        if ch == 'b':
+                            string += '\b'
+                        elif ch == 'f':
+                            string += '\f'
+                        elif ch == 'n':
+                            string += '\n'
+                        elif ch == 'r':
+                            string += '\r'
+                        elif ch == 't':
+                            string += '\t'
+                        elif ch == 'u':
+                            value = 0
+                            for x in range(0, 4):
+                                ch = self.src[self.cursor]
+                                self.cursor += 1
+                                if ch not in "0123456789abcdefABCDEF":
+                                    raise QAPISchemaError(self,
+                                                          '\\u escape needs 4 '
+                                                          'hex digits')
+                                value = (value << 4) + int(ch, 16)
+                            # If Python 2 and 3 didn't disagree so much on
+                            # how to handle Unicode, then we could allow
+                            # Unicode string defaults.  But most of QAPI is
+                            # ASCII-only, so we aren't losing much for now.
+                            if not value or value > 0x7f:
+                                raise QAPISchemaError(self,
+                                                      'For now, \\u escape '
+                                                      'only supports non-zero '
+                                                      'values up to \\u007f')
+                            string += chr(value)
+                        elif ch in "\\/'\"":
+                            string += ch
+                        else:
+                            raise QAPISchemaError(self,
+                                                  "Unknown escape \\%s" %ch)
                         esc = False
                     elif ch == "\\":
                         esc = True
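
The escape rules in this hunk can be tried outside the parser with a
standalone sketch.  The names unescape, EscapeError and SIMPLE are
hypothetical, EscapeError stands in for QAPISchemaError, and the
'Dangling backslash' message is an assumption of the sketch (the real
parser reports a missing terminating quote at end of line instead):

```python
class EscapeError(Exception):
    """Hypothetical stand-in for QAPISchemaError."""


# Single-character escapes accepted by the hunk: JSON's set plus \'
SIMPLE = {'b': '\b', 'f': '\f', 'n': '\n', 'r': '\r', 't': '\t',
          '\\': '\\', '/': '/', "'": "'", '"': '"'}


def unescape(body):
    """Decode the text between the quotes of a QAPI-style string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        i += 1
        if ch != '\\':
            out.append(ch)
            continue
        if i >= len(body):
            raise EscapeError('Dangling backslash')  # sketch-only message
        ch = body[i]
        i += 1
        if ch == 'u':
            digits = body[i:i + 4]
            if len(digits) != 4 or any(
                    d not in '0123456789abcdefABCDEF' for d in digits):
                raise EscapeError('\\u escape needs 4 hex digits')
            i += 4
            value = int(digits, 16)
            # Same restriction as the patch: no NUL, nothing above ASCII.
            if not value or value > 0x7f:
                raise EscapeError('For now, \\u escape only supports '
                                  'non-zero values up to \\u007f')
            out.append(chr(value))
        elif ch in SIMPLE:
            out.append(SIMPLE[ch])
        else:
            raise EscapeError('Unknown escape \\%s' % ch)
    return ''.join(out)


print(unescape(r'\u0066\u006f\u006FA'))   # ASCII-range escapes decode
for bad in (r'\u00e9', r'\u61', r'\x61'):
    try:
        unescape(bad)
    except EscapeError as err:
        print('%s: %s' % (bad, err))
```

The three rejected inputs correspond to the escape-too-big,
escape-too-short and unknown-escape tests added by this commit.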

View File

@@ -212,6 +212,8 @@ check-qapi-schema-y := $(addprefix tests/qapi-schema/, \
 	enum-clash-member.json enum-max-member.json enum-union-clash.json \
 	enum-bad-name.json funny-char.json indented-expr.json \
 	missing-type.json bad-ident.json ident-with-escape.json \
+	escape-outside-string.json unknown-escape.json \
+	escape-too-short.json escape-too-big.json unicode-str.json \
 	double-type.json bad-base.json bad-type-bool.json bad-type-int.json \
 	bad-type-dict.json double-data.json unknown-expr-key.json \
 	redefined-type.json redefined-command.json redefined-builtin.json \

View File

@@ -0,0 +1 @@
+tests/qapi-schema/escape-outside-string.json:3:27: Stray "\"

View File

@@ -0,0 +1 @@
+1

View File

@@ -0,0 +1,3 @@
+# escape sequences are permitted only inside strings
+# { 'command': 'foo', 'data': {} }
+{ 'command': 'foo', 'data'\u003a{} }

View File

@@ -0,0 +1 @@
+tests/qapi-schema/escape-too-big.json:3:14: For now, \u escape only supports non-zero values up to \u007f

View File

@@ -0,0 +1 @@
+1

View File

@@ -0,0 +1,3 @@
+# we don't support full Unicode strings, yet
+# { 'command': 'é' }
+{ 'command': '\u00e9' }

View File

@@ -0,0 +1 @@
+tests/qapi-schema/escape-too-short.json:3:14: \u escape needs 4 hex digits

View File

@@ -0,0 +1 @@
+1

View File

@@ -0,0 +1,3 @@
+# the \u escape requires 4 hex digits
+# { 'command': 'a' }
+{ 'command': '\u61' }

View File

@@ -1 +0,0 @@
-tests/qapi-schema/ident-with-escape.json:3: Expression is missing metatype

View File

@@ -1 +1 @@
-1
+0

View File

@@ -1,4 +1,4 @@
-# FIXME: we should allow escape sequences in strings, if they map back to ASCII
+# we allow escape sequences in strings, if they map back to ASCII
 # { 'command': 'fooA', 'data': { 'bar1': 'str' } }
 { 'c\u006fmmand': '\u0066\u006f\u006FA',
   'd\u0061ta': { '\u0062\u0061\u00721': '\u0073\u0074\u0072' } }
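
To check by hand that the escaped test schema matches its commented
form, the strings can be decoded with a small helper.  This is a
sketch only: decode_u is a hypothetical name, and it deliberately
skips the range checks that the real parser performs:

```python
import re


def decode_u(text):
    """Replace each \\uXXXX escape with its character (no range checks)."""
    return re.sub(r'\\u([0-9a-fA-F]{4})',
                  lambda m: chr(int(m.group(1), 16)), text)


# The five escaped strings from ident-with-escape.json
for escaped in (r'c\u006fmmand', r'\u0066\u006f\u006FA', r'd\u0061ta',
                r'\u0062\u0061\u00721', r'\u0073\u0074\u0072'):
    print(escaped, '->', decode_u(escaped))
```

The decoded values are command, fooA, data, bar1 and str, which is
what the expected parser output for this test contains.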

View File

@@ -0,0 +1,3 @@
+[OrderedDict([('command', 'fooA'), ('data', OrderedDict([('bar1', 'str')]))])]
+[]
+[]

View File

@@ -0,0 +1 @@
+tests/qapi-schema/unicode-str.json:2: 'command' uses invalid name 'é'

View File

@@ -0,0 +1 @@
+1

View File

@@ -0,0 +1,2 @@
+# we don't support full Unicode strings, yet
+{ 'command': 'é' }

View File

@@ -0,0 +1 @@
+tests/qapi-schema/unknown-escape.json:3:21: Unknown escape \x

View File

@@ -0,0 +1 @@
+1

View File

@@ -0,0 +1,3 @@
+# we only recognize JSON escape sequences, plus our \' extension (no \x)
+# { 'command': 'foo', 'data': {} }
+{ 'command': 'foo', 'dat\x61':{} }
