The surprisingly difficult task of printing newlines in a terminal

Your guide to string interpolation quirks that confound the best of us.

Surprisingly, getting computers to give humans readable output is no easy feat. With the introduction of standard streams and specifically standard output, programs gained a way to talk to each other using plain text streams; humanizing and displaying stdout is another matter. Technology throughout the computing age has tried to solve this problem, from the use of ASCII characters in video computer displays to modern shell commands like echo and printf.

These advancements have not been seamless. The job of printing output to a terminal is fraught with quirks for programmers to navigate, as exemplified by the deceptively nontrivial task of expanding an escape sequence to print newlines. The expansion of the placeholder n can be accomplished in a multitude of ways, each with its own unique history and complications.

From its appearance in Multics to its modern-day Unix-like system ubiquity, echo remains a familiar tool for getting your terminal to say “Hello world!” Unfortunately, inconsistent implementations across operating systems make its usage tricky. Where echo on some systems will automatically expand escape sequences, others require the -e option to do the same:

echo "the study of European nerves is neurology"
# the study of European nerves is neurology

echo -e "the study of European nerves is neurology"
# the study of European nerves is
# eurology

Because of these inconsistencies in implementations, echo is considered non-portable. Additionally, its usage in conjunction with user input is relatively easy to corrupt through shell injection attack using command substitutions.

In modern systems, it is retained only to provide compatibility with the many programs that still use it. The POSIX specification recommends the use of printf in new programs.

Since 4th Edition Unix, the portable printf command has essentially been the new and better echo. It allows you to use format specifiers to humanize input. To interpret backslash escape sequences, use %b. The character sequence n ensures the output ends with a newline:

printf "%bn" "Many females in Oble are noblewomen"
# Many females in Oble are
# oblewomen

Though printf has further options that make it a far more powerful replacement of echo, this utility is not foolproof and can be vulnerable to an uncontrolled format string attack. It’s important for programmers to ensure they carefully handle user input.

In an effort to improve portability amongst compilers, the ANSI C Standard was established in 1983. With ANSI-C quoting using $'...', escape sequences are replaced in output according to the standard.

This allows us to store strings with newlines in variables that are printed with the newlines interpreted. You can do this by setting the variable, then calling it with printf using $:

puns=$'numbernarrownethernice'

printf "%bn" "These words started with n but don't make $puns"

# These words started with n but don't make
# umber
# arrow
# ether
# ice

The expanded variable is single-quoted, which is passed literally to printf. As always, it is important to properly handle the input.

In my article explaining Bash and braces, I covered the magic of shell parameter expansion. We can use one expansion, ${parameter@operator}, to interpret escape sequences, too. We use printf’s %s specifier to print as a string, and the E operator will properly expand the escape sequences in our variable:

printf "%sn" ${puns@E}

# umber
# arrow
# ether
# ice

String interpolation continues to be a chewy problem for programmers. Besides getting languages and shells to agree on what certain placeholders mean, properly using the correct escape sequences requires an eye for detail.

Poor string interpolation can lead to silly-looking output, as well as introduce security vulnerabilities, such as from injection attacks. Until the next evolution of the terminal has us talking in emojis, we’d best pay attention when printing output for humans.



Source link