python - Why do numbers in a string become "x0n" when a backslash precedes them?


I was doing a few experiments with escape backslashes in the Python 3.4 shell and noticed something quite strange.

>>> string = "\test\test\1\2\3"
>>> string
'\test\test\x01\x02\x03'
>>> string = "5"
>>> string
'5'
>>> string = "5\6\7"
>>> string
'5\x06\x07'

As you can see in the code above, I defined the variable string as "\test\test\1\2\3". However, when I entered string in the console, instead of printing "\test\test\1\2\3", it printed "\test\test\x01\x02\x03". Why does this occur, and what is it used for?

In Python string literals, the \ character starts escape sequences. \n translates to a newline character, \t to a tab, etc. \xhh hex sequences let you produce codepoints with two-digit hex values instead, \uhhhh produces codepoints with 4-digit hex values, and \Uhhhhhhhh produces codepoints with 8-digit hex values.
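As a small illustration of the escape forms listed above (a sketch only; the variable names are made up for this example), each of the following literals spells the same one-character string:

```python
# Different escape sequences that all produce the newline character.
newline_forms = ["\n", "\x0a", "\u000a", "\U0000000a"]

# Every form is exactly one character long and equal to chr(10).
all_same = all(form == chr(10) and len(form) == 1 for form in newline_forms)
print(all_same)      # True

# \t is the horizontal tab, code point 9.
tab_codepoint = ord("\t")
print(tab_codepoint)  # 9
```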

See the String and Bytes literals documentation, which contains a table of all the possible escape sequences.

When Python echoes a string object in the interpreter (or you use the repr() function on a string object), Python creates a representation of the string value. That representation happens to use the exact same Python string literal syntax, to make it easier to debug your values, as you can use the representation to recreate the exact same value.
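One way to check this round trip yourself is sketched below. It uses ast.literal_eval from the standard library to evaluate the representation as a literal; that helper is my choice for the illustration and is not part of the original answer.

```python
import ast

value = "5\6\7"        # the string from the question
shown = repr(value)    # the text the interactive shell echoes

print(shown)           # '5\x06\x07'

# Evaluating the representation as a literal recreates the same value.
rebuilt = ast.literal_eval(shown)
print(rebuilt == value)  # True
```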

To keep non-printable characters from either causing havoc or not being shown at all, Python uses the same escape sequence syntax to represent those characters. Characters that are not printable are therefore represented using suitable \xhh sequences or, if possible, one of the \c single-letter escapes (so newlines are shown as \n).
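The following sketch shows which form repr() picks for a few control characters. The dictionary and its keys are hypothetical names used only for this example.

```python
# A few control characters and how repr() displays them.
samples = {
    "newline": "\n",
    "tab": "\t",
    "bell": "\a",   # \a is valid in a literal, but repr() shows it as \x07
    "null": "\0",
}

for name, char in samples.items():
    print(name, repr(char))
```

Newline and tab come back as \n and \t, while the bell and null characters fall back to the \xhh form.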

In your example, you created non-printable characters using the \ooo octal value escape sequence syntax. The digits are interpreted as an octal number to create the corresponding codepoint. When echoing that string value back, the default \xhh syntax is used to represent the exact same value in hexadecimal:

>>> '\20'  # octal 20 is decimal 16
'\x10'

Meanwhile, your \t became a tab character:

>>> print('\test')
        est

Note how there is no letter t there; instead, the remaining est is indented by whitespace, a horizontal tab.
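Both effects, the octal escapes and the tab produced by \t, can be verified with a few comparisons. This is a minimal sketch; the variable names are illustrative.

```python
# Octal escapes from the question and their hexadecimal equivalents.
print("\1" == "\x01")    # True
print("\20" == "\x10")   # True
print(ord("\20"))        # 16, because octal 20 is decimal 16

# An octal escape takes at most three digits; a fourth digit is ordinary text.
four_digits = "\1234"
print(four_digits == "S4")  # True: octal 123 is decimal 83, the letter S

# "\test" is a tab followed by "est", so it has four characters, not five.
tabbed = "\test"
print(len(tabbed), tabbed[0] == "\t")
```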

If you need to include literal \ backslash characters, you need to double the character:

>>> '\\test\\1\\2\\3'
'\\test\\1\\2\\3'
>>> print('\\test\\1\\2\\3')
\test\1\2\3
>>> len('\\test\\1\\2\\3')
11

Note that the representation used doubled backslashes! If it didn't, you would not be able to copy the string and paste it back into Python to recreate the value. Using print() to write the value to the terminal as actual characters (and not as a string representation) shows that there are single backslashes there, and taking the length shows that the string has 11 characters, not 15.

You can also use a raw string literal. That is just a different syntax; string objects created with it are the exact same type and have the same value. It is simply a different way of spelling out string values. In a raw string literal, backslashes are just backslashes, as long as a backslash is not the last character in the string; escape sequences are not processed in a raw string literal:

>>> r'\test\1\2\3'
'\\test\\1\\2\\3'
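A short sketch comparing the two spellings follows. The concatenation at the end is one common workaround for the trailing-backslash restriction, offered here as an assumption about what you might need rather than something the answer prescribes.

```python
doubled = "\\test\\1\\2\\3"   # regular literal with doubled backslashes
raw = r"\test\1\2\3"          # raw literal, backslashes kept as they are

print(doubled == raw)         # True: same value
print(type(raw) is str)       # True: same type
print(len(raw))               # 11

# A raw literal cannot end in a single backslash, so r"C:\temp\" is a
# syntax error. Appending a regular "\\" is one way around that.
trailing = r"C:\temp" + "\\"
print(trailing)               # C:\temp\
```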

Last but not least, if you are creating strings that represent filenames on a Windows system, you could also use forward slashes; most Windows APIs don't mind and will accept both types of slash as separators in a filename:

>>> 'c:/this/is/a/valid/path'
'c:/this/is/a/valid/path'
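If you handle paths with the standard library's pathlib module (my addition, not part of the original answer), you can see the two separator styles being treated as the same path. PureWindowsPath only manipulates the text and never touches the file system, so this sketch runs on any operating system; it shows how pathlib normalizes the path, not how any particular Windows API behaves.

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("c:/this/is/a/valid/path")
backward = PureWindowsPath(r"c:\this\is\a\valid\path")

# Both spellings compare equal once parsed as Windows paths.
print(forward == backward)   # True

# The string form is normalized to backslashes.
print(str(forward) == str(backward))
```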
