HTML escaping isn't magic. It's a simple substitution that replaces dangerous characters with safe entity codes. Think of it like writing a note about fire: you write "This paper is &lt;flammable&gt;" instead of "This paper is <flammable>". The browser displays the entities as the original characters but doesn't act on them as markup.
The Core Five Characters
These are the characters that cause real trouble in HTML:
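& becomes &amp;: The escape character itself, so it needs escaping first
< becomes &lt;: Starts HTML tags and scripts
> becomes &gt;: Closes HTML tags and scripts
" becomes &quot;: Used in HTML attributes
' becomes &#39; (or &#x27;): Alternative for attribute values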
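To see these five substitutions in practice, here is a small sketch using Python's standard-library html.escape. The choice of Python is an assumption for illustration, and the names samples and escaped are made up for this example. Note that html.escape writes the single quote as &#x27;, which is equivalent to &#39;.

```python
import html

# The five core characters, escaped one at a time.
samples = ["&", "<", ">", '"', "'"]

# quote=True (the default) makes html.escape handle both quote characters too.
escaped = {ch: html.escape(ch, quote=True) for ch in samples}

for ch, entity in escaped.items():
    print(f"{ch!r} -> {entity}")
```

Your own language or framework will likely have an equivalent helper, and using it is generally safer than writing the substitutions by hand.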
The Order Matters
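If you escape by running one replacement after another, you can't apply them in an arbitrary order. The ampersand must be escaped first. Otherwise the & inside entities you have already produced gets escaped again, and &lt; turns into &amp;lt;. It's like cleaning a dirty window: you wipe the big smudges before polishing the details.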
Right way: Escape & first, then < and >
Wrong way: Escape < first, so < becomes &lt;. When & is escaped afterwards, that &lt; becomes &amp;lt;, and the browser shows the literal text &lt; instead of <
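The difference between the two orders can be sketched with chained string replacements. This is a minimal illustration in Python, not a production escaper. The function names escape_amp_first and escape_amp_last are hypothetical, and only &, < and > are handled to keep it short.

```python
import html


def escape_amp_first(text: str) -> str:
    # Ampersand first, so the "&" inside later entities is never re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text


def escape_amp_last(text: str) -> str:
    # Deliberately wrong: the final step re-escapes the "&" in "&lt;" and "&gt;".
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    text = text.replace("&", "&amp;")
    return text


sample = "<b> & co"
right = escape_amp_first(sample)  # "&lt;b&gt; &amp; co"
wrong = escape_amp_last(sample)   # "&amp;lt;b&amp;gt; &amp; co"

# Decoding once, as a browser would, recovers the input only for the right order.
print(html.unescape(right))  # <b> & co
print(html.unescape(wrong))  # &lt;b&gt; & co
```

A single-pass escaper that looks at each input character exactly once does not have this problem. The ordering rule applies to sequential replacements like the ones above.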
What About Other Characters?
You might see tools escaping forward slashes (/), equals signs (=), or even spaces. These usually aren't necessary for basic security. I've worked on codebases where developers escaped everything "just to be safe," only to create display issues with international characters or emojis. Stick with the five core characters unless you have a specific reason to escape others.
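As a quick check of that advice, the sketch below passes text containing slashes, an equals sign, a space, an accented letter and an emoji through Python's standard-library html.escape. The sample string is invented for illustration. None of those characters are touched, because the function only rewrites the five core characters.

```python
import html

# No &, <, >, " or ' in this string, so there is nothing to escape.
text = "path=/café/menu ☕"
result = html.escape(text)

print(result == text)  # True
```

Non-ASCII characters display correctly without escaping as long as the page declares its encoding (typically UTF-8).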