NetTalk Central

Author Topic: UTF8 coming in to API Server  (Read 1101 times)

ProperGary

  • Newbie
  • Posts: 19
UTF8 coming in to API Server
« on: November 23, 2022, 06:49:10 AM »
Hi All,

I have an API server that receives UTF-8 data and puts it into a queue. NetTalk (Server version 12.49) is set to save as Windows-1252 (see attached image).

My issue is that strings are not getting converted, and converting them manually does not work either (I changed NetTalk back to save as UTF-8 before trying):
Code:
stSave=incomingQueue.fulltitle
iAnsiLen=LEN(CLIP(stSave))
incomingQueue.fulltitle=str.Utf8ToAnsi(incomingQueue.fulltitle,iAnsiLen,st:CP_WINDOWS_1252)
and
Code:
     str.SetValue(incomingQueue.fulltitle,TRUE)
     str.ToAnsi(st:EncodeUtf8,st:CP_WINDOWS_1252)
     str.GetValue()

StringTheory version 3.53

The String in JSON sent from Postman looked like this:
Code:
"fulltitle":"A1. Haremlik / A2. Istanbul / A3. Kar\u0131\u015Ft\u0131rma Kuklal\u0131\u011F\u0131n\u0131n \u0130plerini / A4. Hayda Bre / B1. Otoban / B2. Ich Bin Ein Auslander / B3. Arabesk",
In Clarion I get:
Code:
A1. Haremlik / A2. Istanbul / A3. Kar??t?rma Kuklal???n?n ?plerini / A4. Hayda Bre / B1. Otoban / B2. Ich Bin Ein Auslander / B3. Arabesk
Any suggestions as to how I can convert / resolve?

Gary


Bruce

  • Global Moderator
  • Hero Member
  • Posts: 11167
Re: UTF8 coming in to API Server
« Reply #1 on: November 23, 2022, 09:34:41 PM »
Bear in mind that Unicode contains a lot more characters than are available in Windows-1252. So if there's no mapping from a character to your code page, then you'll typically get ?'s during the conversion.

In other words, converting from Unicode to Windows-1252 (or any code page) risks losing information. That's simply how it works.

In this case the string is JSON-encoded. They're actually sending you ASCII text, with the non-ASCII characters encoded in \uXXXX format (see json.org for more on that).
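
To make that concrete, here is a small sketch in Python (not Clarion, and not how jFiles does it internally; it only illustrates the JSON escaping itself). The sample text is a shortened piece of the fulltitle value from the first post. The variable names are made up for the example.

```python
import json

# The raw JSON text as it travels over the wire: plain ASCII,
# with every non-ASCII character written as a \uXXXX escape.
# (Raw string, so Python leaves the backslashes alone.)
raw = r'{"fulltitle": "Kar\u0131\u015Ft\u0131rma Kuklal\u0131\u011F\u0131n\u0131n \u0130plerini"}'

# Nothing in the payload is outside the ASCII range.
print(raw.isascii())

# Decoding the JSON turns each escape into the real Unicode character.
title = json.loads(raw)["fulltitle"]

# ascii() re-escapes the text so it prints safely on any console.
print(ascii(title))
```

So the payload is not UTF-8 in any interesting sense; the Unicode characters only appear once the JSON escapes are decoded.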

In your case you're getting what appear to be Turkish characters (\u0131 is a dotless i), which don't map to Windows-1252.
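
As a rough illustration of the loss (again a Python sketch rather than Clarion, using Python's built-in cp1252 and cp1254 codecs as stand-ins for the Windows code pages), encoding the decoded text to Windows-1252 replaces each unmappable character with ?, while the Turkish code page Windows-1254 can hold all of them. The cp1254 comparison is my own addition for contrast, not something from your setup.

```python
# A shortened piece of the title after the JSON escapes have been decoded.
title = "Kar\u0131\u015ft\u0131rma Kuklal\u0131\u011f\u0131n\u0131n \u0130plerini"

# Windows-1252 has no slot for these Turkish letters, so each one
# is replaced with "?" when replacement is requested.
lossy = title.encode("cp1252", errors="replace")
print(lossy.decode("cp1252"))

# Windows-1254 (Turkish) does contain them, so the round trip is lossless.
turkish = title.encode("cp1254")
roundtrip = turkish.decode("cp1254")
print(roundtrip == title)
```

The first print gives the same ?-riddled text that you are seeing in Clarion, which suggests the conversion itself is running and the characters simply have nowhere to go in Windows-1252.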

Incidentally, you don't need to do the conversion yourself - it's done for you by jFiles if
json.SetJsonDecode(true)
which is the default.

The code page to use when converting "from" is automatically fetched from your system{prop:charset}.

Cheers
Bruce

ProperGary

  • Newbie
  • Posts: 19
Re: UTF8 coming in to API Server
« Reply #2 on: December 06, 2022, 10:00:11 AM »
Hi Bruce,

Thanks for your response. Sorry to be so slow in replying, I was pulled onto other priorities!

We are moving the database away from Windows-1252, but that will be some time early(ish) next year. In the meantime we wanted to look at keeping the JSON encoding and either converting manually, or storing as JSON and converting when displayed.

To that end, I added self.SetJsonDecode(FALSE) to the json | Construct() embed point of the NetWebServiceMethod, but the strings are still being decoded. Should this be elsewhere?

Gary

Bruce

  • Global Moderator
  • Hero Member
  • Posts: 11167
Re: UTF8 coming in to API Server
« Reply #3 on: December 07, 2022, 01:37:49 AM »
Put it in the json.Start method, _after_ the parent call.

Cheers
Bruce