Encodings

Encodings

Davor Spasoski-2

Dear Kannel users & developers,

 

Can someone give precise information on what happens, encoding-wise, from smsbox to the SMSC? I understand that as of 1.4.1:

 

Smsbox is expecting UTF-8 by default

Communication smsbox <-> bearerbox is only via UTF-8

Bearerbox <-> SMSC is supposed to be ISO-8859-1

 

But then we have alt-dcs and alt-addr-charset, which are supposed to enable the GSM-7 alphabet between the SMSC and bearerbox. Although documented, neither seems to work from 1.4.2 onwards. There is a slight difference when I add alt-charset=GSM, but it is certainly not sending GSM (I get a lot of question marks until I get to the 0x28 character).

What if I have a specific SMSC that uses GSM-7, or even something weirder like Escaped ISO-8859-1, which combines ISO and GSM 7-bit?

Is SMSC – bearerbox in UTF-8 possible?

 

BR,

 

Davor Spasoski

 




Disclaimer: one.Vip DOO Skopje
This e-mail (including any attachments) is confidential and may be protected by legal privilege. If you are not the intended recipient, you should not copy it, re-transmit it, use it or disclose its contents, but should return it to the sender immediately and delete your copy from your system. Any unauthorized use or dissemination of this message in whole or in part is strictly prohibited. Please note that e-mails are susceptible to change. one.Vip DOO Skopje shall not be liable for the improper or incomplete transmission of the information contained in this communication nor for any delay in its receipt or damage to your system.
Please, do not print this e-mail unless it is necessary! Think about saving the environment!

Re: Encodings

Stipe Tolj-2
On 02.04.17 21:57, Davor Spasoski wrote:
> Dear kannel users&developers,

Hi Davor,

please don't cross-post to several mailing lists; we consider this spamming.

Your question is more related to internals, so devel@ should be the
right place to ask.

> Can someone give precise information what happens encoding wise from
> smsbox to SMSC. I understand that as of 1.4.1:
>
> Smsbox is expecting UTF-8 by default

Correct, the sendsms HTTP interface assumes UTF-8 encoded input (unless
indicated otherwise via the 'coding' and 'charset' HTTP GET variables).
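
A minimal sketch of building such a sendsms request in Python. The 'coding' and 'charset' variables are the ones named above; the host, port, path, credentials and the remaining parameter names (username, password, to, text) are assumptions for illustration and should be checked against the user guide of the Kannel version in use.

```python
from urllib.parse import urlencode


def build_sendsms_url(base_url, username, password, to, text,
                      charset="UTF-8", coding=0):
    """Build a sendsms GET URL (hypothetical helper, not part of Kannel).

    The text is encoded in `charset` first and then percent-encoded, so the
    bytes on the wire match what the 'charset' variable announces.
    """
    params = {
        "username": username,          # assumed parameter name
        "password": password,          # assumed parameter name
        "to": to,                      # assumed parameter name
        "text": text.encode(charset),  # bytes are percent-encoded as-is
        "charset": charset,
        "coding": str(coding),
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local smsbox endpoint and made-up credentials.
    print(build_sendsms_url("http://localhost:13013/cgi-bin/sendsms",
                            "tester", "secret", "12345", "café"))
```

Encoding the text explicitly before percent-encoding avoids depending on whatever charset a browser or HTTP client happens to pick.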

> Communication smsbox <-> bearerbox is only via UTF-8

IF the message is considered to be textual (coding=0), yes, UTF-8 is the
internal encoding.

IF coding=1 is indicated, then it's a raw byte stream, with no encoding
implied.

IF coding=2, then the payload is left as UCS-2 internally.
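
To make the three cases concrete, here is a rough Python illustration of how the same payload looks in each representation. This is not Kannel's code: the `Coding` enum and `internal_bytes` helper are hypothetical names, and UTF-16BE is used as a stand-in for UCS-2 (the two agree for characters in the Basic Multilingual Plane).

```python
from enum import IntEnum


class Coding(IntEnum):
    TEXT = 0    # textual message, UTF-8 internally
    BINARY = 1  # raw byte stream, no charset involved
    UCS2 = 2    # 16-bit Unicode text


def internal_bytes(payload, coding):
    """Return the bytes a payload would occupy for a given coding value."""
    coding = Coding(coding)
    if coding is Coding.TEXT:
        return payload.encode("utf-8")
    if coding is Coding.UCS2:
        # UTF-16BE equals UCS-2 for BMP characters (assumption of this sketch).
        return payload.encode("utf-16-be")
    # BINARY: the caller must already supply bytes; pass them through untouched.
    return bytes(payload)


if __name__ == "__main__":
    print(internal_bytes("é", Coding.TEXT).hex())          # UTF-8 bytes
    print(internal_bytes("é", Coding.UCS2).hex())          # UCS-2 bytes
    print(internal_bytes(b"\x00\x01", Coding.BINARY).hex())
```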

> Bearerbox <-> SMSC is supposed to be ISO-8859-1

No, ISO-8859-1 is Latin-1, and that is not a general default. Depending
on the SMSC type, different upstream encodings are used by default.

E.g. for SMPP the default encoding (aka data coding scheme, DCS 0x00) is
GSM 03.38.

> But then we have alt-dcs and alt-addr-charset that are supposed to
> enable GSM-7 alphabet between SMSC and bearerbox, but although
> documented, they both don’t seem to work from 1.4.2 onwards. There is a
> slight difference when I add alt-charset=GSM, but it certainly is not
> sending GSM. (I get a lot of question marks until I get to 0x28 character)

The config directive 'alt-charset' in the SMPP config groups defines which
default alphabet the SMSC assumes for its DCS 0x00 encoding.

Keep in mind that 'alt-charset' relies on the iconv() library, which does
NOT include GSM 03.38. So there is no 'alt-charset' value that selects
GSM 03.38, and none is needed, since it is the default. The directive can
only switch to other default encodings.
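
For readers unfamiliar with GSM 03.38, a small Python sketch of its basic character table may help. It is an illustration only, not Kannel's converter: `encode_gsm0338` is a hypothetical helper, it emits one septet per octet (unpacked), and it ignores the extension table (so characters such as the euro sign fall back to '?').

```python
# GSM 03.38 basic character set; the index in this string is the septet value.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
assert len(GSM_BASIC) == 128


def encode_gsm0338(text, replacement="?"):
    """Map text to unpacked GSM 03.38 septets, one per output byte."""
    fallback = GSM_BASIC.index(replacement)
    out = bytearray()
    for ch in text:
        code = GSM_BASIC.find(ch)
        # Unknown characters and the escape marker itself become the fallback.
        if code < 0 or ch == "\x1b":
            code = fallback
        out.append(code)
    return bytes(out)


if __name__ == "__main__":
    print(encode_gsm0338("@").hex())     # '@' is septet 0x00 in GSM 03.38
    print(encode_gsm0338("Hi_$").hex())
```

Note how several positions differ from ASCII and Latin-1: '@' is 0x00, '$' is 0x02 and '_' is 0x11, which is why feeding Latin-1 bytes to an SMSC that expects GSM 03.38 garbles exactly these characters.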

> What if I have specific SMSC that is using GSM-7 or even something more
> weird like Escaped ISO-8859-1 that combines ISO and GSM 7-bit.
>
> Is SMSC – bearerbox in UTF-8 possible?

yes, 'alt-charset = UTF-8' would simply send the payload as UTF-8
encoded text. AFAIR, the HTTP SMSC types do this.
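
Conceptually, the conversion described above takes the internal UTF-8 text and re-encodes it into the configured charset. A rough Python analogue, with Python's codecs standing in for iconv() and a hypothetical helper name:

```python
def to_smsc_charset(internal_utf8, alt_charset):
    """Re-encode internal UTF-8 bytes into the charset the SMSC expects.

    Python's codec machinery stands in for iconv() here; characters that the
    target charset cannot represent are replaced with '?'.
    """
    text = internal_utf8.decode("utf-8")
    return text.encode(alt_charset, errors="replace")


if __name__ == "__main__":
    payload = "é".encode("utf-8")
    print(to_smsc_charset(payload, "utf-8").hex())       # unchanged UTF-8
    print(to_smsc_charset(payload, "iso-8859-1").hex())  # single Latin-1 byte
    print(to_smsc_charset(payload, "ascii"))             # not representable
```

With the target set to UTF-8 the payload passes through unchanged, which matches the "simply send the payload as UTF-8" behaviour described above.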

--
Best Regards,
Stipe Tolj

-------------------------------------------------------------------
Düsseldorf, NRW, Germany

Kannel Foundation                 tolj.org system architecture
http://www.kannel.org/            http://www.tolj.org/

[hidden email]                  [hidden email]
-------------------------------------------------------------------


RE: Encodings

Davor Spasoski-2
Hi Stipe,

Thank you for your reply. I apologize for cross-posting. This is really useful information.
I ran some tests with a few versions from 1.4.4 to SVN and the behaviour is consistent. Kill me if I'm wrong, but I remember that long ago, with older versions and browsers, I was able to URL-encode the GSM characters with their hex values and get them properly on the handset, usually using alt-dcs=1 on our Comverse SMSC. My mistake with my tests is that I was doing the same now, but the browser

The enlightenment for me is the alt-charset setting, which was not clear to me from the user guide.

One more question: alt-addr-charset is there to prevent PDU breaking if 0x00 is in the address. But how come 0x00 in the short_message does not break it with GSM 7-bit?
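
One possible explanation, sketched in Python under the assumption that the SMPP 3.4 field types apply: address fields are C-Octet Strings (terminated by a NUL byte), while short_message is an Octet String whose length is carried separately in sm_length. The helper names are hypothetical and this is not Kannel's PDU code.

```python
def pack_c_octet(value):
    """Address-style field: the bytes followed by a terminating NUL."""
    return value + b"\x00"


def read_c_octet(buf, offset=0):
    """Read up to the first NUL; an embedded 0x00 ends the field early."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end], end + 1


def pack_short_message(payload):
    """short_message-style field: one length octet, then the raw bytes."""
    if len(payload) > 254:
        raise ValueError("short_message is limited to 254 octets")
    return bytes([len(payload)]) + payload


def read_short_message(buf, offset=0):
    """Read exactly as many bytes as the length octet says, NULs included."""
    length = buf[offset]
    start = offset + 1
    return buf[start:start + length], start + length


if __name__ == "__main__":
    gsm_at_home = b"\x00home"  # '@home' in GSM 03.38, where '@' is 0x00
    print(read_c_octet(pack_c_octet(gsm_at_home)))
    print(read_short_message(pack_short_message(gsm_at_home)))
```

Under that assumption, a 0x00 inside a NUL-terminated address truncates the field, whereas the length-prefixed short_message is read in full regardless of its content.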

Thanks a lot again!

Davor Spasoski

-----Original Message-----
From: devel [mailto:[hidden email]] On Behalf Of Stipe Tolj
Sent: 06 April 2017 11:18
Cc: [hidden email]
Subject: Re: Encodings
