HTML Functions - Encode and Decode
HTML.encode() and HTML.decode() convert strings to and from HTML entities. Use them when working with HTML content, to help prevent XSS attacks, and to make special characters display correctly in web applications.
HTML.encode()
Description:
Encodes plain text into HTML by converting characters into HTML entities.
Syntax:
HTML.encode(string)
Parameters:
string - The plain text string to encode
Returns: HTML-encoded string with special characters converted to entities
Examples:
HTML.encode('One & Two')
// Returns: "One &amp; Two"
HTML.encode('<div>Hello</div>')
// Returns: "&lt;div&gt;Hello&lt;/div&gt;"
HTML.encode('Price: $50 < $100')
// Returns: "Price: $50 &lt; $100"
HTML.encode('Copyright © 2024')
// Returns: "Copyright &copy; 2024"
HTML.encode('"Quote" & \'Apostrophe\'')
// Returns: "&quot;Quote&quot; &amp; &#39;Apostrophe&#39;"
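To make the conversion concrete, here is a minimal JavaScript sketch of entity encoding for the five characters that are significant in HTML markup. `encodeHtml` is a hypothetical stand-in, not the implementation behind HTML.encode(). It assumes apostrophes are emitted as &#39; and does not produce named entities such as &copy;.

```javascript
// Hypothetical stand-in for HTML.encode(); covers only the five
// characters that are significant in HTML markup.
const ENTITY_BY_CHAR = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;', // assumption: numeric form rather than &apos;
};

function encodeHtml(text) {
  // A single pass means the "&" of an emitted entity is never re-encoded.
  return String(text).replace(/[&<>"']/g, (ch) => ENTITY_BY_CHAR[ch]);
}

console.log(encodeHtml('One & Two'));        // One &amp; Two
console.log(encodeHtml('<div>Hello</div>')); // &lt;div&gt;Hello&lt;/div&gt;
```

Replacing all characters in one pass, rather than chaining one replacement per character, avoids having to order the replacements so that "&" is handled first.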
HTML.decode()
Description:
Decodes HTML into plain text by converting HTML entities into characters.
Syntax:
HTML.decode(string)
Parameters:
string - The HTML-encoded string to decode
Returns: Plain text string with HTML entities converted to characters
Examples:
HTML.decode('One &amp; Two')
// Returns: "One & Two"
HTML.decode('&lt;div&gt;Hello&lt;/div&gt;')
// Returns: "<div>Hello</div>"
HTML.decode('Price: $50 &lt; $100')
// Returns: "Price: $50 < $100"
HTML.decode('Copyright &copy; 2024')
// Returns: "Copyright © 2024"
HTML.decode('&quot;Quote&quot; &amp; &#39;Apostrophe&#39;')
// Returns: '"Quote" & \'Apostrophe\''
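Decoding is the reverse lookup. The JavaScript sketch below is a hypothetical stand-in for HTML.decode(), not its actual implementation. It recognizes a small, assumed set of named entities plus decimal and hexadecimal numeric references, and leaves anything it does not recognize unchanged.

```javascript
// Hypothetical stand-in for HTML.decode(); the entity list is illustrative,
// not the full set a real decoder would support.
const CHAR_BY_NAME = {
  amp: '&', lt: '<', gt: '>', quot: '"', apos: "'",
  nbsp: '\u00A0', copy: '\u00A9', reg: '\u00AE', trade: '\u2122',
};

function decodeHtml(html) {
  // One pass over the input, so "&amp;lt;" decodes to "&lt;" and not to "<".
  return String(html).replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Out-of-range references are left as written.
      return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown entity names are left as written.
    return Object.prototype.hasOwnProperty.call(CHAR_BY_NAME, body)
      ? CHAR_BY_NAME[body]
      : match;
  });
}

console.log(decodeHtml('One &amp; Two'));         // One & Two
console.log(decodeHtml('Copyright &copy; 2024')); // Copyright © 2024
```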
Common HTML Entities
| Character | HTML Entity | Description |
|---|---|---|
| < | &lt; | Less than |
| > | &gt; | Greater than |
| & | &amp; | Ampersand |
| " | &quot; | Double quote |
| ' | &#39; or &apos; | Single quote/Apostrophe |
| (space) | &nbsp; | Non-breaking space |
| © | &copy; | Copyright |
| ® | &reg; | Registered trademark |
| ™ | &trade; | Trademark |
Common Use Cases
Preventing XSS Attacks:
// Sanitize user input before displaying in HTML
{
"displayName": HTML.encode($userInput),
"safeToRender": true
}
Encoding HTML Content for APIs:
// Encode HTML content before sending to API
{
"content": HTML.encode($htmlContent),
"format": "html"
}
Decoding HTML from API Response:
// Decode HTML entities from API response
{
"title": HTML.decode($response.title),
"description": HTML.decode($response.description)
}
Safe HTML Display:
// Encode user-generated content before displaying
HTML.encode($userComment)
// Prevents script injection: <script>alert('XSS')</script>
// Becomes: &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;
Processing Email Content:
// Encode plain text for HTML email
{
"subject": $subject,
"body": HTML.encode($plainTextBody),
"contentType": "text/html"
}
Round-Trip Encoding/Decoding:
// Encode then decode should return original
HTML.decode(HTML.encode("Test <tag>"))
// Returns: "Test <tag>"
Handling URLs in HTML:
// URL-encode the query value, then HTML-encode the finished URL for use in an attribute
{
"url": HTML.encode("/search?q=" + encodeURIComponent($searchQuery)),
"safe": true
}
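URL encoding and HTML encoding solve different problems, and a URL placed in an HTML attribute needs both, in that order. The JavaScript sketch below illustrates the layering. `encodeHtml` is a hypothetical stand-in for HTML.encode(), and the `cat` parameter and sample values are made up for illustration.

```javascript
// Hypothetical stand-in for HTML.encode().
function encodeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

const searchQuery = 'fish & chips <fresh>';
const category = 'food';

// Step 1: percent-encode each value so it cannot alter the URL structure.
const url = '/search?q=' + encodeURIComponent(searchQuery) +
  '&cat=' + encodeURIComponent(category);

// Step 2: HTML-encode the finished URL so the "&" separator is valid in an attribute.
const link = '<a href="' + encodeHtml(url) + '">Search</a>';

console.log(url);  // /search?q=fish%20%26%20chips%20%3Cfresh%3E&cat=food
console.log(link); // <a href="/search?q=fish%20%26%20chips%20%3Cfresh%3E&amp;cat=food">Search</a>
```

HTML-encoding the raw query on its own would leave spaces and "&" in the URL, so the value could be cut short or read as an extra parameter.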
Security Considerations
XSS Prevention:
- Always Encode User Input: Always use HTML.encode() on user-generated content before displaying it in HTML.
- Defense in Depth: HTML encoding is one layer of security; also validate and sanitize input.
- Context Matters: HTML encoding prevents XSS in HTML content but may not be sufficient for JavaScript contexts.
Example - Unsafe vs Safe:
// UNSAFE - User input directly in HTML
"<div>" + $userInput + "</div>"
// If $userInput = "<script>alert('XSS')</script>"
// Results in: <div><script>alert('XSS')</script></div>
// SAFE - Encoded user input
"<div>" + HTML.encode($userInput) + "</div>"
// If $userInput = "<script>alert('XSS')</script>"
// Results in: <div>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</div>
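The "Context Matters" point deserves its own illustration: inside an inline script block the browser does not decode entities, so HTML encoding is the wrong tool there. One common approach, sketched below in JavaScript, is to serialize the value as a JSON literal and escape "<". `toScriptLiteral` is a hypothetical helper name, not part of the HTML functions described here.

```javascript
// Serialize a value for embedding inside an inline <script> block.
// JSON.stringify produces a valid JavaScript literal; escaping "<" as \u003c
// keeps a "</script>" sequence in the data from closing the element early.
function toScriptLiteral(value) {
  return JSON.stringify(value).replace(/</g, '\\u003c');
}

const userInput = "</script><script>alert('XSS')</script>";
const page = '<script>var name = ' + toScriptLiteral(userInput) + ';</script>';

console.log(page);
```

The escaped literal still evaluates to the original string when the script runs, so no decoding step is needed on the receiving side.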
Best Practices
- Encode Output: Always encode user input when displaying it in HTML contexts.
- Decode Input: Decode HTML-encoded data when you need to process or store the original text.
- Validate First: Validate input data before encoding to ensure data quality.
- Context-Appropriate: Use HTML encoding for HTML contexts; use different encoding for URLs, JavaScript, etc.
- Preserve Data Integrity: Store data in its original form and encode only when rendering.
- Test Edge Cases: Test with special characters, international characters, and edge cases.
- Don't Double-Encode: Avoid encoding already-encoded content; a second pass turns &amp; into &amp;amp;, which then displays as literal entity text.
- Round-Trip Validation: Test that decode(encode(x)) equals x for your data.
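The last two practices are easy to check in code. The JavaScript sketch below uses hypothetical stand-ins for HTML.encode() and HTML.decode(), limited to five characters, to show what double encoding produces and how a round-trip test over edge cases might look.

```javascript
// Hypothetical stand-ins for HTML.encode() and HTML.decode(), limited to
// the five markup-significant characters.
const ENCODE = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
const DECODE = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'" };

const encodeHtml = (s) => s.replace(/[&<>"']/g, (ch) => ENCODE[ch]);
const decodeHtml = (s) => s.replace(/&(?:amp|lt|gt|quot|#39);/g, (ent) => DECODE[ent]);

// Double encoding: the second pass re-encodes the "&" of every entity.
const once = encodeHtml('Tom & Jerry');  // "Tom &amp; Jerry"
const twice = encodeHtml(once);          // "Tom &amp;amp; Jerry"

// Round-trip check over a few edge cases, including text that already
// looks encoded and non-ASCII characters.
const samples = ['', 'plain', 'a < b && c > d', '"q" \'a\'', '&amp; already encoded', 'héllo ©'];
const failures = samples.filter((s) => decodeHtml(encodeHtml(s)) !== s);

console.log(once, '|', twice, '|', failures.length); // Tom &amp; Jerry | Tom &amp;amp; Jerry | 0
```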
Comparison with Other Encoding Functions
| Function | Purpose | Input Example | Output Example |
|---|---|---|---|
| HTML.encode() | HTML entity encoding | <div> | &lt;div&gt; |
| encodeURIComponent() | URL encoding | hello world | hello%20world |
| Base64.encode() | Base64 encoding | Hello | SGVsbG8= |
| iconv.encode() | Character set encoding | Text | Byte array |
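The same four kinds of encoding can be compared side by side in plain JavaScript. This sketch assumes a runtime with the `btoa` and `TextEncoder` globals (browsers and recent Node.js). `btoa` and `TextEncoder` stand in for Base64.encode() and iconv.encode(), whose exact behavior is not specified here, and `encodeHtml` is a hypothetical stand-in for HTML.encode().

```javascript
// Hypothetical stand-in for HTML.encode().
function encodeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

const htmlEncoded = encodeHtml('<div>');              // "&lt;div&gt;"
const urlEncoded = encodeURIComponent('hello world'); // "hello%20world"
const base64 = btoa('Hello');                         // "SGVsbG8="

// TextEncoder always produces UTF-8 bytes; it is used here only to show
// that character set encoding yields bytes rather than a string.
const bytes = new TextEncoder().encode('Text');       // Uint8Array [84, 101, 120, 116]

console.log(htmlEncoded, urlEncoded, base64, Array.from(bytes));
```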
Practical Examples
Building HTML Email:
{
"to": $recipient,
"subject": "Welcome",
"body": "<html><body>" +
"<h1>" + HTML.encode($name) + "</h1>" +
"<p>" + HTML.encode($message) + "</p>" +
"</body></html>"
}
Processing Form Data:
// Encode form data for display
{
"formData": {
"name": HTML.encode($form.name),
"email": HTML.encode($form.email),
"comment": HTML.encode($form.comment)
}
}
Generating HTML Table:
// Build HTML table with encoded content
$data.map(row =>
"<tr>" +
"<td>" + HTML.encode(row.name) + "</td>" +
"<td>" + HTML.encode(row.value) + "</td>" +
"</tr>"
).join("")
Safe Dynamic HTML Generation:
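// Generate HTML with user content
// Assuming $link is a complete URL, it is HTML-encoded for the attribute:
// encodeURIComponent() would escape ":" and "/" and leaves "'" unescaped
"<div class='user-content'>" +
"<h2>" + HTML.encode($title) + "</h2>" +
"<p>" + HTML.encode($description) + "</p>" +
"<a href='" + HTML.encode($link) + "'>" +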
HTML.encode($linkText) +
"</a>" +
"</div>"
Processing RSS/Atom Feeds:
// Decode HTML entities from feed content
{
"title": HTML.decode($feed.title),
"description": HTML.decode($feed.description),
"content": HTML.decode($feed.content)
}
Sanitizing Rich Text:
// First encode all HTML, then decode safe entities
{
"sanitized": HTML.encode($richText)
.replace("&nbsp;", " ")
.replace("&copy;", "©")
}
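One caveat about the chained replacements above. If the expression language follows JavaScript string semantics (an assumption; this page does not say), replace() with a string pattern changes only the first match. The sketch below shows the difference in plain JavaScript, using a made-up sample string.

```javascript
const encoded = 'A&nbsp;B&nbsp;C';

// A string pattern replaces only the first occurrence.
const firstOnly = encoded.replace('&nbsp;', ' ');

// replaceAll, or a regular expression with the g flag, replaces every occurrence.
const everyMatch = encoded.replaceAll('&nbsp;', ' ');
const withRegex = encoded.replace(/&nbsp;/g, ' ');

console.log(firstOnly);  // A B&nbsp;C
console.log(everyMatch); // A B C
console.log(withRegex);  // A B C
```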
Function Reference
| Function | Input | Output | Use Case |
|---|---|---|---|
| HTML.encode() | Plain text | HTML entities | Display user input safely |
| HTML.decode() | HTML entities | Plain text | Process encoded content |