Garbage characters in HTTP GET request for Arabic input

(Imad Jundi) #1

I have set up a bot that asks the user which product they are interested in and queries the matching products from a PostgreSQL database using the HTTP request service.
It works fine for English names like GR5 2007,
but when the input is Arabic I get the following in the log and no result is returned, even though many products match the user’s input.

When the input is English (GR52017),
I get the correct result, showing all products matching GR52017:
[22/Sep/2017 08:16:40] “GET /ads/777/GR52017/ HTTP/1.1” 200 1179

But when the input is Arabic (هاتف),
I get the following garbage:
[22/Sep/2017 08:29:57] code 400, message Bad request syntax (‘GET /ads/777/غرÙ\x81Ø©%20Ù\x86Ù\x88Ù\x85/ HTTP/1.1’)
[22/Sep/2017 08:29:57] “GET /ads/777/غرÙ�Ø©%20Ù�Ù�Ù�/ HTTP/1.1” 400 -

(John Jackson) #2

I think the problem here is that URLs may only contain ASCII characters, and you’re feeding in a UTF-8 string that contains Arabic characters.

So we try to convert those characters to ASCII, which of course goes badly because it’s impossible.

The body of a request can safely contain UTF-8 characters, so does the API you’re using support sending the payload in the body?
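
For reference, the standard way to carry non-ASCII text in a URL is percent-encoding: the text is encoded as UTF-8 and each byte is written as `%XX`. Whether the bot's HTTP request service does this for you is not something this thread confirms, so the following is only a minimal Python sketch of the idea. The host name and the `build_ads_url` helper are hypothetical; the `/ads/777/.../` path shape is taken from the log above.

```python
from urllib.parse import quote, unquote

BASE_URL = "http://localhost:8000"  # hypothetical host, not from the thread


def build_ads_url(category_id, term):
    """Build an ASCII-only URL with the search term as one path segment."""
    # quote() encodes the term as UTF-8 and percent-escapes every byte that
    # is not an unreserved ASCII character. safe="" also escapes "/" so the
    # term cannot be split into several path segments.
    return f"{BASE_URL}/ads/{category_id}/{quote(term, safe='')}/"


url = build_ads_url(777, "هاتف")
print(url)  # http://localhost:8000/ads/777/%D9%87%D8%A7%D8%AA%D9%81/

# The server side reverses the encoding to recover the original text.
print(unquote(url.rsplit("/", 2)[1]))  # هاتف
```

Most web frameworks decode the percent-escapes in the path before matching routes, so the receiving view would normally see the original Arabic string.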

(Imad Jundi) #3

Looks like this is the root cause of my issue.
Yes, the body and payload support UTF-8, and I can post data in Arabic.

Thanks

(Imad Jundi) #4

Is there any workaround for this?

(Sarah Palombo) #5

@Jundi

Have you tried posting the data with the details in the body of your request (not in the URL)?

As John says, the body of a request can safely contain UTF-8 characters, so you could try that next.
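
A rough sketch of what that could look like from a Python client, assuming the API accepts a JSON body. The `/ads/777/search/` endpoint, the `query` field name, and the `build_search_request` helper are all hypothetical and would need to match whatever the real API expects.

```python
import json
from urllib.request import Request


def build_search_request(url, term):
    """Build a POST request that carries the search term in a UTF-8 JSON body."""
    # ensure_ascii=False keeps the Arabic text as-is in the JSON string;
    # the explicit .encode("utf-8") turns it into the bytes sent on the wire.
    body = json.dumps({"query": term}, ensure_ascii=False).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


# Hypothetical endpoint; the URL itself stays pure ASCII.
req = build_search_request("http://localhost:8000/ads/777/search/", "هاتف")
print(req.get_method())           # POST
print(req.data.decode("utf-8"))   # {"query": "هاتف"}

# To actually send it: urllib.request.urlopen(req)
```

The point of the design is that the URL contains only ASCII, while the Arabic text travels in the body with its encoding declared in the `Content-Type` header.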

Hope that helps

Sarah.