What characters need to be encoded and why?
| ASCII Control characters | |
| Why: | These characters are not printable. |
| Characters: | Includes the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F hex (127 decimal). |
| Non-ASCII characters | |
| Why: | These are by definition not legal in URLs since they are not in the ASCII set. |
| Characters: | Includes the entire "top half" of the ISO-Latin set, 80-FF hex (128-255 decimal). |
| "Reserved characters" | |
| Why: | URLs use some characters for special use in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded. |
| Characters: | |
| "Unsafe characters" | |
| Why: | Some characters can be misinterpreted within URLs for various reasons. These characters should also always be encoded. |
| Characters: | |
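The two numeric ranges given above (ASCII control characters and non-ASCII characters) can be checked directly from a character's code point. The following is a minimal sketch in Python; the function name `needs_encoding_by_range` is hypothetical, and it deliberately does not cover the "reserved" or "unsafe" categories, since those are defined by explicit character lists rather than ranges.

```python
def needs_encoding_by_range(ch: str) -> bool:
    """Return True if ch is an ASCII control character or a non-ASCII character.

    Hypothetical helper: it only covers the two range-based categories,
    not the "reserved" or "unsafe" character lists.
    """
    code = ord(ch)
    # ASCII control characters: 00-1F hex (0-31 decimal) and 7F hex (127 decimal)
    is_control = code <= 0x1F or code == 0x7F
    # Non-ASCII: 80-FF hex in ISO-Latin; higher code points are also outside ASCII
    is_non_ascii = code >= 0x80
    return is_control or is_non_ascii


for sample in ["\n", "\x7f", "\xe9", "a"]:
    print(repr(sample), needs_encoding_by_range(sample))
```

A plain letter such as "a" falls in neither range, so the helper returns False for it; whether such a character still needs encoding depends on the reserved and unsafe categories.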
How are characters URL encoded?
URL encoding of a character consists of a "%" symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point for the character.
Example:
- Space = decimal code point 32 in the ISO-Latin set.
- 32 decimal = 20 in hexadecimal.
- The URL encoded representation will be "%20".
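The steps in this example can be sketched in Python. This is an illustration under the assumption used throughout this post, namely that the character is encoded by its ISO-Latin (Latin-1) code point; the helper name `percent_encode_char` is hypothetical. Note that the standard library's `urllib.parse.quote` uses UTF-8 by default, so the `encoding="latin-1"` argument is passed explicitly to match the description here.

```python
from urllib.parse import quote


def percent_encode_char(ch: str) -> str:
    """Encode one character as '%' plus the two-digit hex of its ISO-Latin code point.

    Hypothetical helper for illustration; raises UnicodeEncodeError for
    characters outside the ISO-Latin (Latin-1) set.
    """
    byte = ch.encode("latin-1")        # a single byte for any ISO-Latin character
    return "%{:02X}".format(byte[0])   # e.g. 32 decimal -> "20" hex -> "%20"


print(percent_encode_char(" "))          # %20
print(quote(" "))                        # %20 (standard library equivalent)
print(quote("\xe9", encoding="latin-1")) # %E9 (e with acute accent, code point 233)
```

Because the hexadecimal digits are case-insensitive, "%e9" and "%E9" denote the same encoded character; the sketch emits uppercase only as a convention.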
Reference: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm