What is URL Encoding?

A URL may need to carry spaces and characters such as ‘&’, ‘?’, ‘<’ and ‘>’. Some of these characters are reserved: they have a structural meaning inside the URL itself (for example, ‘?’ starts the query string and ‘&’ separates parameters). Others are simply not allowed to appear literally. If such a character occurs in the data, it must be converted into a form that cannot be mistaken for URL syntax. URL encoding, also called percent-encoding, does this by replacing the character with ‘%’ followed by the two-digit hexadecimal value of its byte.

Once the program has received the URL and split it into its parameters, any encoded parameter has to be converted back to its original form before it is used. Converting a character to its hex equivalent is called URL encoding, and converting it back to its original form is called decoding.
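
The round trip described above can be sketched in a few lines of Perl. This is a minimal illustration, not a full implementation: the helper names pct_encode() and pct_decode() are made up for this example, and the code assumes ASCII input.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode every character that is not
# a letter, digit, '-', '.', '_' or '~'. Assumes ASCII input.
sub pct_encode {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $s;
}

# Hypothetical helper: turn each %XX sequence back into its character.
sub pct_decode {
    my ($s) = @_;
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $s;
}

my $value   = "a&b=c";               # contains two reserved characters
my $encoded = pct_encode($value);
my $decoded = pct_decode($encoded);

print "$encoded\n";   # a%26b%3Dc
print "$decoded\n";   # a&b=c
```

Without the encoding step, the ‘&’ and ‘=’ inside the value would be read as a parameter separator and a key/value separator.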

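Suppose a URL is built from two parameter values:

str1: “proc+test”
str2: “This is a test string”

“http://www.virtualitworld.co.in/code/test.cgi?first=str1&second=str2”

After encoding, it becomes:
http://www.virtualitworld.co.in/code/test.cgi?first=proc%2Btest&second=This+is+a+test+string

Note that the ‘+’ sign in str1 changes to ‘%2B’, and each space in str2 is replaced by a ‘+’ sign. Writing a space as ‘+’ is the convention for form data in a query string; in other parts of a URL a space is written as ‘%20’. This is also why a literal ‘+’ has to be encoded: otherwise it would be decoded as a space.

This procedure of converting strings to a valid URL string format is called URL Encoding.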
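
As a rough sketch of how the URL above could be assembled, the following Perl uses only core features. The helper name form_encode() is an assumption made for this example, and the code assumes ASCII input; non-ASCII text would first have to be converted to bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: form-style encoding of one query-string value.
sub form_encode {
    my ($value) = @_;
    # Percent-encode everything except unreserved characters and spaces.
    $value =~ s/([^A-Za-z0-9\-._~ ])/sprintf("%%%02X", ord($1))/ge;
    # Form encoding writes each space as '+'.
    $value =~ tr/ /+/;
    return $value;
}

my $str1 = "proc+test";
my $str2 = "This is a test string";

my $url = "http://www.virtualitworld.co.in/code/test.cgi"
        . "?first="  . form_encode($str1)
        . "&second=" . form_encode($str2);

print "$url\n";
# http://www.virtualitworld.co.in/code/test.cgi?first=proc%2Btest&second=This+is+a+test+string
```

Only the parameter values are encoded. The ‘?’, ‘=’ and ‘&’ that structure the query string are added afterwards and stay literal.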

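The table below lists decimal and hex codes for some characters that need encoding. Codes 33 to 126 are ASCII; codes 128 to 255 come from the Windows-1252 (Latin-1) character set. For an ASCII character, the encoded form is ‘%’ followed by the two hex digits, so ‘&’ (0x26) becomes ‘%26’. For characters above 127 the encoded bytes depend on the character encoding in use: in UTF-8, the usual choice today, ‘é’ becomes ‘%C3%A9’ rather than ‘%E9’.

Chart of some special characters used in encoding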

Dec Hex HTML Char
33 0x21 &#33; !
35 0x23 &#35; #
36 0x24 &#36; $
37 0x25 &#37; %
38 0x26 &#38; &
39 0x27 &#39; '
40 0x28 &#40; (
41 0x29 &#41; )
42 0x2A &#42; *
43 0x2B &#43; +
44 0x2C &#44; ,
45 0x2D &#45; -
46 0x2E &#46; .
47 0x2F &#47; /
58 0x3A &#58; :
60 0x3C &#60; <
61 0x3D &#61; =
62 0x3E &#62; >
63 0x3F &#63; ?
64 0x40 &#64; @
91 0x5B &#91; [
92 0x5C &#92; \
93 0x5D &#93; ]
94 0x5E &#94; ^
95 0x5F &#95; _
96 0x60 &#96; `
123 0x7B &#123; {
124 0x7C &#124; |
125 0x7D &#125; }
126 0x7E &#126; ~
128 0x80 &#128; €
129 0x81 &#129; (unused)
130 0x82 &#130; ‚
131 0x83 &#131; ƒ
132 0x84 &#132; „
133 0x85 &#133; …
134 0x86 &#134; †
135 0x87 &#135; ‡
136 0x88 &#136; ˆ
137 0x89 &#137; ‰
138 0x8A &#138; Š
139 0x8B &#139; ‹
140 0x8C &#140; Œ
141 0x8D &#141; (unused)
142 0x8E &#142; Ž
143 0x8F &#143; (unused)
144 0x90 &#144; (unused)
145 0x91 &#145; ‘
146 0x92 &#146; ’
147 0x93 &#147; “
148 0x94 &#148; ”
149 0x95 &#149; •
150 0x96 &#150; –
151 0x97 &#151; (em dash)
152 0x98 &#152; ˜
153 0x99 &#153; ™
154 0x9A &#154; š
155 0x9B &#155; ›
156 0x9C &#156; œ
157 0x9D &#157; (unused)
158 0x9E &#158; ž
159 0x9F &#159; Ÿ
161 0xA1 &#161; ¡
162 0xA2 &#162; ¢
163 0xA3 &#163; £
164 0xA4 &#164; ¤
165 0xA5 &#165; ¥
166 0xA6 &#166; ¦
167 0xA7 &#167; §
168 0xA8 &#168; ¨
169 0xA9 &#169; ©
170 0xAA &#170; ª
171 0xAB &#171; «
172 0xAC &#172; ¬
173 0xAD &#173; ­
174 0xAE &#174; ®
175 0xAF &#175; ¯
176 0xB0 &#176; °
177 0xB1 &#177; ±
178 0xB2 &#178; ²
179 0xB3 &#179; ³
180 0xB4 &#180; ´
181 0xB5 &#181; µ
182 0xB6 &#182; ¶
183 0xB7 &#183; ·
184 0xB8 &#184; ¸
185 0xB9 &#185; ¹
186 0xBA &#186; º
187 0xBB &#187; »
188 0xBC &#188; ¼
189 0xBD &#189; ½
190 0xBE &#190; ¾
191 0xBF &#191; ¿
192 0xC0 &#192; À
193 0xC1 &#193; Á
194 0xC2 &#194; Â
195 0xC3 &#195; Ã
196 0xC4 &#196; Ä
197 0xC5 &#197; Å
198 0xC6 &#198; Æ
199 0xC7 &#199; Ç
200 0xC8 &#200; È
201 0xC9 &#201; É
202 0xCA &#202; Ê
203 0xCB &#203; Ë
204 0xCC &#204; Ì
205 0xCD &#205; Í
206 0xCE &#206; Î
207 0xCF &#207; Ï
208 0xD0 &#208; Ð
209 0xD1 &#209; Ñ
210 0xD2 &#210; Ò
211 0xD3 &#211; Ó
212 0xD4 &#212; Ô
213 0xD5 &#213; Õ
214 0xD6 &#214; Ö
215 0xD7 &#215; ×
216 0xD8 &#216; Ø
217 0xD9 &#217; Ù
218 0xDA &#218; Ú
219 0xDB &#219; Û
220 0xDC &#220; Ü
221 0xDD &#221; Ý
222 0xDE &#222; Þ
223 0xDF &#223; ß
224 0xE0 &#224; à
225 0xE1 &#225; á
226 0xE2 &#226; â
227 0xE3 &#227; ã
228 0xE4 &#228; ä
229 0xE5 &#229; å
230 0xE6 &#230; æ
231 0xE7 &#231; ç
232 0xE8 &#232; è
233 0xE9 &#233; é
234 0xEA &#234; ê
235 0xEB &#235; ë
236 0xEC &#236; ì
237 0xED &#237; í
238 0xEE &#238; î
239 0xEF &#239; ï
240 0xF0 &#240; ð
241 0xF1 &#241; ñ
242 0xF2 &#242; ò
243 0xF3 &#243; ó
244 0xF4 &#244; ô
245 0xF5 &#245; õ
246 0xF6 &#246; ö
247 0xF7 &#247; ÷
248 0xF8 &#248; ø
249 0xF9 &#249; ù
250 0xFA &#250; ú
251 0xFB &#251; û
252 0xFC &#252; ü
253 0xFD &#253; ý
254 0xFE &#254; þ
255 0xFF &#255; ÿ
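
To connect the Hex column with what actually appears in a URL, here is a small sketch using Perl's core Encode module. The helper name percent_encode_all() is made up for this example, and the choice of UTF-8 is an assumption: it is the common encoding for URLs today, but a server may expect something else.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: convert the text to UTF-8 bytes, then write
# every byte as '%' followed by its two-digit hex value.
sub percent_encode_all {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    $bytes =~ s/(.)/sprintf("%%%02X", ord($1))/gse;
    return $bytes;
}

print percent_encode_all("&"), "\n";        # %26     (matches 0x26 in the table)
print percent_encode_all("+"), "\n";        # %2B     (matches 0x2B in the table)
print percent_encode_all("\x{E9}"), "\n";   # %C3%A9  (the character 'é', two bytes in UTF-8)
```

For ASCII characters the result matches the table directly. The last line shows that a character listed as 0xE9 in the table takes two bytes, and therefore two percent codes, once it is encoded as UTF-8.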

 

In Perl, URL encoding is done with the uri_escape() function from the URI::Escape module (it is not part of CGI.pm). To decode a string from its hex codes back to the original text, use uri_unescape() from the same module. Note that uri_escape() writes a space as ‘%20’, and uri_unescape() leaves a ‘+’ unchanged.

For example:

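use strict;
use warnings;
use URI::Escape;                  # exports uri_escape() and uri_unescape()

my $str1 = "proc%2Btest";         # percent-encoded input
my $str2 = uri_unescape($str1);   # decodes %2B back to '+'
print "$str2\n";

Output: proc+test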
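
The example above covers only decoding. A matching sketch for the encoding direction might look like the following. It assumes the URI::Escape module is installed (it ships with the URI distribution, not with core Perl), and the variable names are illustrative only.

```perl
use strict;
use warnings;
use URI::Escape;   # not a core module; part of the URI distribution

my $enc1 = uri_escape("proc+test");
my $enc2 = uri_escape("This is a test string");

print "$enc1\n";   # proc%2Btest
print "$enc2\n";   # This%20is%20a%20test%20string

# Optional form-style variant: write the encoded spaces as '+'.
(my $form2 = $enc2) =~ s/%20/+/g;
print "$form2\n";  # This+is+a+test+string

# Decoding restores the original string.
print uri_unescape($enc1), "\n";   # proc+test
```

The substitution of ‘%20’ with ‘+’ is safe here because any literal ‘+’ in the input has already been turned into ‘%2B’ by uri_escape().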


