Python - Why do numbers in a string become "x0n" when a backslash precedes them?
I was doing a few experiments with escaped backslashes in the Python 3.4 shell and noticed something quite strange.
>>> string = "\test\test\1\2\3"
>>> string
'\test\test\x01\x02\x03'
>>> string = "5"
>>> string
'5'
>>> string = "5\6\7"
>>> string
'5\x06\x07'
As you can see in the code above, I defined the variable string as "\test\test\1\2\3". However, when I entered string in the console, instead of printing "\test\test\1\2\3", it printed "\test\test\x01\x02\x03". Why does this occur, and what is it used for?
In Python string literals, the \ character starts escape sequences. \n translates to a newline character, \t to a tab, and so on. \xhh hex sequences let you produce codepoints from two-digit hex values instead, \uhhhh produces codepoints from 4-digit hex values, and \Uhhhhhhhh produces codepoints from 8-digit hex values.
See the String and Bytes literals documentation, which contains a table of all the possible escape sequences.
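As a small sketch of the escape forms listed above (the variable names are illustrative and not part of the original post), each of the following literals spells the same one-character string:

```python
# Four spellings of the same codepoint, U+0041 (the letter "A").
octal_form = "\101"          # \ooo: up to three octal digits (octal 101 == 65)
hex_form = "\x41"            # \xhh: exactly two hex digits
short_unicode = "\u0041"     # \uhhhh: exactly four hex digits
long_unicode = "\U00000041"  # \Uhhhhhhhh: exactly eight hex digits

all_forms = [octal_form, hex_form, short_unicode, long_unicode]
print(all_forms)
```

Whichever spelling is used in the source, the resulting string object is the same; the escape only exists in the literal.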
When Python echoes a string object in the interpreter (or you use the repr() function on a string object), Python creates a representation of the string value. That representation happens to use the exact same Python string literal syntax. This makes it easier to debug your values, because you can use the representation to recreate the exact same value.
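The round trip described above can be checked with a short sketch. It assumes the string from the question and uses ast.literal_eval() as one safe way to parse the representation back into a value; the variable names are made up for illustration.

```python
import ast

original = "\test\test\1\2\3"    # the string from the question
representation = repr(original)  # the text the interactive shell echoes

# The representation is valid string literal syntax, so parsing it
# gives back a string equal to the original.
recreated = ast.literal_eval(representation)

print(representation)
print(recreated == original)
```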
To keep non-printable characters from either causing havoc or not being shown at all, Python uses the same escape sequence syntax to represent those characters. Characters that are not printable are represented using suitable \xhh sequences or, where possible, one of the single-letter \c escapes (so newlines are shown as \n).
In your example, you created non-printable characters using the \ooo octal escape sequence syntax. The digits are interpreted as an octal number to create the corresponding codepoint. When echoing that string value back, the default \xhh syntax is used to represent the exact same value in hexadecimal:
>>> '\20'  # octal 20 is decimal 16
'\x10'
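To make the octal-to-hexadecimal step concrete, here is a minimal sketch that compares the codepoint of '\20' with the same digits parsed as base 8. The names char and code_point are illustrative only.

```python
# "\20" is read as octal 20, which is 16 in decimal and 10 in hexadecimal.
char = "\20"
code_point = ord(char)

print(code_point)                  # the codepoint as a decimal number
print(code_point == int("20", 8))  # same digits parsed as base 8
print(hex(code_point))             # the value repr() writes as \x10
print(char == "\x10")              # both literals give the same string
```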
Meanwhile, your \t became a tab character:
>>> print('\test')
        est
Note how there is no letter t there; instead, the remaining est is indented by whitespace, a horizontal tab.
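One way to see what the literal from the question really contains is to list the codepoint of every character. This is only a sketch, and the variable names are assumptions for illustration.

```python
question_string = "\test\test\1\2\3"

# Each \t collapses to one tab (codepoint 9); \1, \2 and \3 become
# the control characters with codepoints 1, 2 and 3.
code_points = [ord(ch) for ch in question_string]

print(len(question_string))
print(code_points)
```

The list starts with 9 rather than the codepoints for a backslash (92) and the letter t (116), which confirms that no backslash survived in the string.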
If you need to include literal \ backslash characters, you need to double the character:
>>> '\\test\\1\\2\\3'
'\\test\\1\\2\\3'
>>> print('\\test\\1\\2\\3')
\test\1\2\3
>>> len('\\test\\1\\2\\3')
11
Note that the representation used doubled backslashes! If it didn't, you would not be able to copy the string and paste it back into Python to recreate the value. Using print() to write the value to the terminal as actual characters (and not as a string representation) shows that there are single backslashes there, and taking the length shows that we have just 11 characters in the string, not 15.
You can also use a raw string literal. That is just a different syntax; the string objects created with that syntax are the exact same type, with the same value. It is just a different way of spelling out string values. In a raw string literal, backslashes are just backslashes, as long as they are not the last character in the string; escape sequences do not work in a raw string literal:
>>> r'\test\1\2\3'
'\\test\\1\\2\\3'
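The following sketch checks both claims about raw literals: that they produce an ordinary str equal to the doubled-backslash spelling, and that a raw literal cannot end in a single backslash. The compile() call and the C:\dir path are illustrative assumptions, used only to demonstrate the syntax error without breaking the script itself.

```python
raw = r"\test\1\2\3"
escaped = "\\test\\1\\2\\3"

print(raw == escaped)              # same value
print(type(raw) is type(escaped))  # same type, plain str

# The source text r"oops\" ends in a single backslash, so the closing
# quote is treated as part of the literal and the string never terminates.
bad_source = 'r"oops\\"'
try:
    compile(bad_source, "<example>", "eval")
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False
print(trailing_backslash_ok)

# One common workaround: append the final backslash as a normal literal.
with_trailing = r"C:\dir" + "\\"
print(with_trailing)
```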
Last but not least, if you are creating strings that represent filenames on a Windows system, you can also use forward slashes; most Windows APIs don't mind and accept both types of slash as separators in a filename:
>>> 'c:/this/is/a/valid/path'
'c:/this/is/a/valid/path'
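As a hedged illustration, the standard library's pathlib.PureWindowsPath parses Windows-style paths on any operating system and treats both separators the same way. This shows how pathlib interprets the strings, not how the Windows APIs themselves behave; the variable names are made up.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("c:/this/is/a/valid/path")
backward = PureWindowsPath(r"c:\this\is\a\valid\path")

print(forward == backward)  # both spellings name the same path
print(forward.parts)        # drive and root first, then each component
print(forward.as_posix())   # rendered with forward slashes again
```

Using forward slashes (or a raw string, as for backward) avoids accidental escapes such as \t inside a path literal.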