HTML Functions - Encode and Decode

HTML.encode() and HTML.decode() convert strings to and from HTML entities. Use them whenever you work with HTML content: encoding helps prevent XSS attacks and ensures that special characters display correctly in web applications, and decoding recovers the original text from entity-encoded data.

HTML.encode()

Description:

Encodes plain text for use in HTML by converting special characters, such as <, >, &, and quotes, into HTML entities.

Syntax:

HTML.encode(string)

Parameters:

  • string - The plain text string to encode

Returns: HTML-encoded string with special characters converted to entities

Examples:

HTML.encode('One & Two')
// Returns: "One &amp; Two"

HTML.encode('<div>Hello</div>')
// Returns: "&lt;div&gt;Hello&lt;/div&gt;"

HTML.encode('Price: $50 < $100')
// Returns: "Price: $50 &lt; $100"

HTML.encode('Copyright © 2024')
// Returns: "Copyright &copy; 2024"

HTML.encode('"Quote" & \'Apostrophe\'')
// Returns: "&quot;Quote&quot; &amp; &#39;Apostrophe&#39;"
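
To make the conversion concrete, here is a minimal JavaScript sketch of an encoder of this kind. Note that encodeHtml is a hypothetical stand-in, not the platform's HTML.encode(): it escapes only the five characters that are significant in markup, whereas the examples above show HTML.encode() also converting symbols such as © into named entities.

```javascript
// Hypothetical stand-in for HTML.encode(); covers only the five
// characters that are significant in HTML markup.
const ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function encodeHtml(text) {
  // A single pass over the input means each "&" is escaped exactly once,
  // and the ampersands introduced by other replacements are never re-escaped.
  return String(text).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

console.log(encodeHtml("One & Two"));        // One &amp; Two
console.log(encodeHtml("<div>Hello</div>")); // &lt;div&gt;Hello&lt;/div&gt;
```

Doing the replacement in one pass, rather than chaining one replace call per character, avoids any dependence on the order in which characters are handled.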

HTML.decode()

Description:

Decodes HTML into plain text by converting HTML entities into characters.

Syntax:

HTML.decode(string)

Parameters:

  • string - The HTML-encoded string to decode

Returns: Plain text string with HTML entities converted to characters

Examples:

HTML.decode('One &amp; Two')
// Returns: "One & Two"

HTML.decode('&lt;div&gt;Hello&lt;/div&gt;')
// Returns: "<div>Hello</div>"

HTML.decode('Price: $50 &lt; $100')
// Returns: "Price: $50 < $100"

HTML.decode('Copyright &copy; 2024')
// Returns: "Copyright © 2024"

HTML.decode('&quot;Quote&quot; &amp; &#39;Apostrophe&#39;')
// Returns: '"Quote" & \'Apostrophe\''
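
The reverse direction can be sketched the same way. The decodeHtml function and its small entity table below are illustrative assumptions, not the platform's HTML.decode(); a full decoder would recognize the complete list of named entities defined by the HTML standard.

```javascript
// Hypothetical stand-in for HTML.decode(); the named-entity table is
// deliberately tiny and only covers the entities used on this page.
const NAMED = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
  nbsp: "\u00A0", copy: "\u00A9", reg: "\u00AE", trade: "\u2122",
};

function decodeHtml(text) {
  // Match named (&amp;), decimal (&#39;) and hexadecimal (&#x27;) references.
  const pattern = /&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return String(text).replace(pattern, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave references outside the Unicode range untouched.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeHtml("One &amp; Two")); // One & Two
console.log(decodeHtml("&amp;lt;"));      // &lt;
```

The second call shows that a single pass decodes exactly one level: the text produced by a replacement is not scanned again.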

Common HTML Entities

Character   HTML Entity         Description
<           &lt;                Less than
>           &gt;                Greater than
&           &amp;               Ampersand
"           &quot;              Double quote
'           &#39; or &apos;     Single quote/Apostrophe
(space)     &nbsp;              Non-breaking space
©           &copy;              Copyright
®           &reg;               Registered trademark
™           &trade;             Trademark

Common Use Cases

Preventing XSS Attacks:

// Sanitize user input before displaying in HTML
{
  "displayName": HTML.encode($userInput),
  "safeToRender": true
}

Encoding HTML Content for APIs:

// Encode HTML content before sending to API
{
  "content": HTML.encode($htmlContent),
  "format": "html"
}

Decoding HTML from API Response:

// Decode HTML entities from API response
{
  "title": HTML.decode($response.title),
  "description": HTML.decode($response.description)
}

Safe HTML Display:

// Encode user-generated content before displaying
HTML.encode($userComment)
// Prevents script injection: <script>alert('XSS')</script>
// Becomes: &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;

Processing Email Content:

// Encode plain text for HTML email
{
  "subject": $subject,
  "body": HTML.encode($plainTextBody),
  "contentType": "text/html"
}

Round-Trip Encoding/Decoding:

// Encode then decode should return original
HTML.decode(HTML.encode("Test <tag>"))
// Returns: "Test <tag>"

Handling URLs in HTML:

// URL-encode the query value, then HTML-encode the whole URL for use in an HTML attribute
{
  "url": HTML.encode("/search?q=" + encodeURIComponent($searchQuery)),
  "safe": true
}

Security Considerations

XSS Prevention:

  • Always Encode User Input: Always use HTML.encode() on user-generated content before displaying it in HTML.
  • Defense in Depth: HTML encoding is one layer of security; also validate and sanitize input.
  • Context Matters: HTML encoding protects HTML element content and quoted attribute values. It is not sufficient inside JavaScript, CSS, or URL contexts, which each need their own encoding.
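
The point about context can be illustrated with a short JavaScript sketch that prepares the same untrusted value for three different destinations. The helper names encodeHtml and encodeForScript are hypothetical and do not come from the platform; encodeURIComponent and JSON.stringify are standard JavaScript.

```javascript
// Hypothetical stand-in for HTML.encode(), limited to five characters.
function encodeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// For a value embedded in an inline <script> block: quote it as JSON, then
// escape "<" so the emitted text can never contain "</script>".
function encodeForScript(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const input = "</script><script>alert('XSS')</script>";

// HTML element content or a quoted attribute value
const forHtml = encodeHtml(input);
// One component of a URL query string
const forQuery = encodeURIComponent(input);
// A JavaScript string literal inside a <script> block
const forScript = encodeForScript(input);

console.log(forHtml);
console.log(forQuery);
console.log(forScript);
```

Each result is safe only in the context it was produced for, which is why the encoder should be chosen at the point of output rather than applied once at input time.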

Example - Unsafe vs Safe:

// UNSAFE - User input directly in HTML
"<div>" + $userInput + "</div>"
// If $userInput = "<script>alert('XSS')</script>"
// Results in: <div><script>alert('XSS')</script></div>

// SAFE - Encoded user input
"<div>" + HTML.encode($userInput) + "</div>"
// If $userInput = "<script>alert('XSS')</script>"
// Results in: <div>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</div>

Best Practices

  • Encode Output: Always encode user input when displaying it in HTML contexts.
  • Decode Input: Decode HTML-encoded data when you need to process or store the original text.
  • Validate First: Validate input data before encoding to ensure data quality.
  • Context-Appropriate: Use HTML encoding for HTML contexts; use different encoding for URLs, JavaScript, etc.
  • Preserve Data Integrity: Store data in its original form and encode only when rendering.
  • Test Edge Cases: Test with special characters, international characters, and edge cases.
  • Don't Double-Encode: Avoid encoding already-encoded content. A second pass turns &amp; into &amp;amp;, so the page shows a literal "&amp;" instead of "&".
  • Round-Trip Validation: Test that HTML.decode(HTML.encode(x)) equals x for your data.
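
The last two practices can be checked with a small JavaScript sketch. The encodeHtml and decodeHtml functions are hypothetical stand-ins for HTML.encode() and HTML.decode(), restricted to the five basic entities, and the sample string is made up for illustration.

```javascript
// Hypothetical stand-ins for HTML.encode() / HTML.decode().
const TO_ENTITY = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const FROM_ENTITY = { "&amp;": "&", "&lt;": "<", "&gt;": ">", "&quot;": '"', "&#39;": "'" };

const encodeHtml = (s) => String(s).replace(/[&<>"']/g, (ch) => TO_ENTITY[ch]);
const decodeHtml = (s) => String(s).replace(/&(?:amp|lt|gt|quot|#39);/g, (e) => FROM_ENTITY[e]);

const original = 'Tom & "Jerry" <b>';

// Round trip: decoding the encoded text gives back the original.
const once = encodeHtml(original);
console.log(decodeHtml(once) === original); // true

// Double encoding: the second pass escapes the "&" of every entity.
const twice = encodeHtml(once);
console.log(twice.includes("&amp;amp;")); // true

// A single decode now yields the encoded form, not the original text.
console.log(decodeHtml(twice) === once); // true
```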

Comparison with Other Encoding Functions

Function               Purpose                  Input Example   Output Example
HTML.encode()          HTML entity encoding     <div>           &lt;div&gt;
encodeURIComponent()   URL encoding             hello world     hello%20world
Base64.encode()        Base64 encoding          Hello           SGVsbG8=
iconv.encode()         Character set encoding   Text            Byte array
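
The first three rows of the comparison can be reproduced in plain JavaScript. In this sketch encodeHtml is a hypothetical stand-in for HTML.encode(), and the standard btoa function is used as an analogue of Base64.encode(); it assumes a runtime that provides btoa globally, such as a browser or a recent Node.js release.

```javascript
// Hypothetical stand-in for HTML.encode(), limited to five characters.
const ENTITY = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
const encodeHtml = (s) => String(s).replace(/[&<>"']/g, (ch) => ENTITY[ch]);

// HTML entity encoding: makes text safe to place inside markup
const htmlEncoded = encodeHtml("<div>");

// URL encoding: makes text safe inside one component of a URL
const urlEncoded = encodeURIComponent("hello world");

// Base64 encoding: represents bytes as ASCII text (btoa used as an analogue)
const base64Encoded = btoa("Hello");

console.log(htmlEncoded);   // &lt;div&gt;
console.log(urlEncoded);    // hello%20world
console.log(base64Encoded); // SGVsbG8=
```

The three outputs are not interchangeable: each encoding targets a different destination format, so none of them substitutes for another.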

Practical Examples

Building HTML Email:

{
  "to": $recipient,
  "subject": "Welcome",
  "body": "<html><body>" +
          "<h1>" + HTML.encode($name) + "</h1>" +
          "<p>" + HTML.encode($message) + "</p>" +
          "</body></html>"
}

Processing Form Data:

// Encode form data for display
{
  "formData": {
    "name": HTML.encode($form.name),
    "email": HTML.encode($form.email),
    "comment": HTML.encode($form.comment)
  }
}

Generating HTML Table:

// Build HTML table with encoded content
$data.map(row =>
  "<tr>" +
    "<td>" + HTML.encode(row.name) + "</td>" +
    "<td>" + HTML.encode(row.value) + "</td>" +
  "</tr>"
).join("")

Safe Dynamic HTML Generation:

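// Generate HTML with user content.
// HTML-encode the href value: if $link is a full URL, encodeURIComponent()
// would also escape ":" and "/" and break it. Validate the URL scheme separately.
"<div class='user-content'>" +
  "<h2>" + HTML.encode($title) + "</h2>" +
  "<p>" + HTML.encode($description) + "</p>" +
  "<a href='" + HTML.encode($link) + "'>" +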
    HTML.encode($linkText) +
  "</a>" +
"</div>"

Processing RSS/Atom Feeds:

// Decode HTML entities from feed content
{
  "title": HTML.decode($feed.title),
  "description": HTML.decode($feed.description),
  "content": HTML.decode($feed.content)
}

Sanitizing Rich Text:

// Encode everything, then restore selected entities that the input already contained
{
  "sanitized": HTML.encode($richText)
    .replace("&amp;nbsp;", "&nbsp;")
    .replace("&amp;copy;", "&copy;")
}
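
One detail worth checking in this pattern is how many occurrences get restored. In JavaScript, String.prototype.replace with a string pattern changes only the first match; whether the platform's replace behaves the same way is not stated here. The sketch below sidesteps the question by using a global regular expression. The function names and the allowlist are illustrative assumptions.

```javascript
// Hypothetical stand-in for HTML.encode(), limited to five characters.
function encodeHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Entities allowed to pass through unchanged (illustrative allowlist).
const ALLOWED = ["nbsp", "copy"];

function encodeKeepingEntities(text) {
  // After encoding, an entity such as "&nbsp;" in the input reads "&amp;nbsp;".
  const doubled = new RegExp("&amp;(" + ALLOWED.join("|") + ");", "g");
  // The "g" flag restores every occurrence, not just the first one.
  return encodeHtml(text).replace(doubled, "&$1;");
}

console.log(encodeKeepingEntities("a&nbsp;b&nbsp;<i>&copy;</i> &lt;"));
// a&nbsp;b&nbsp;&lt;i&gt;&copy;&lt;/i&gt; &amp;lt;
```

Entities that are not on the allowlist, such as the trailing &lt; in the sample, stay double-encoded and therefore render as literal text.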

Function Reference

Function        Input           Output          Use Case
HTML.encode()   Plain text      HTML entities   Display user input safely
HTML.decode()   HTML entities   Plain text      Process encoded content