FiveTech Support Forums

FiveWin / Harbour / xBase community
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
TGet() - UTF8 encoding fails [Solved]
Posted: Thu Sep 14, 2023 08:26 AM
UTF8 encoding fails in TGet()!
Code (fw):
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 200,20 PIXEL OF oDlg VARCHAR 70 PICTURE "@!70"

   @ 240, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet():        " + cVar1 + " - " + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL


When the parameters <VARCHAR/lnLimitChars> and/or <PICTURE/cPict> are used, the encoding changes from UTF-8 to Unicode when editing!
Windows 11 Pro 22H2 22621.1848

Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384

Harbour 3.2.0dev (r2008190002)

FWH 23.10 x86
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails
Posted: Thu Sep 14, 2023 08:01 PM
"UTF-8 to Unicode"
UTF-8 is Unicode.
You probably mean ANSI to UTF8.
Regards



G. N. Rao.

Hyderabad, India
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails
Posted: Thu Sep 14, 2023 08:28 PM

Yes, the encoding switches from UTF8 to ANSI.

Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Fri Oct 06, 2023 09:44 AM

Dear Mr. Nageswara Rao,

Can you confirm the unwanted change of the encoding?

If so, do you plan to correct this behavior?

Many greetings

Frank

Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Mon Oct 09, 2023 07:39 AM

Looking into this.

Please wait a little

Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Mon Oct 09, 2023 08:55 AM
super, ok :D
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 12:11 AM
I copied your program as it is and built with FWH2307 and this is what I got.


However, there is a lot more to discuss about TGet and Umlauts.
Please wait for my next post.
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 06:36 AM
Yes, so far everything is in order.
But when editing, the encoding switches!
"Please wait for my next post."
OK, I will wait; it is not very urgent. In some places I have switched to TEdit(), but I would like to return to TGet().
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 09:06 AM
"But when editing, the encoding switches!"
Please try this:
Code (fw):
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   FW_SetUnicode( .T. )

   DEFINE DIALOG oDlg SIZE 300, 300 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg PICTURE "@!" VARCHAR 70 ;
      ON CHANGE oDlg:Update()

   @  60, 20 SAY cVar1 SIZE 250,30 PIXEL OF oDlg UPDATE

   @ 100, 20 SAY STRTOHEX( cVar1, " " ) SIZE 260,60 PIXEL OF oDlg UPDATE

   @ 200, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet(): " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 04:06 PM
Nothing has changed!

If I put an 'a' at the end of the given characters, then the encoding changes to ANSI:
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 06:12 PM
I am running the code I posted.
I do not see any problems here.
Are you using FWH2307, please?
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 08:56 PM
I noticed that all hexcodes in your example are ANSI and that there are NO UTF8 2-byte hexcodes!

Probably the encoding is already changed to ANSI before the TGet() was activated!?

Or maybe the text object that displays the hexcodes is itself the cause?

I'll test it tomorrow 8)
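
The check described above (are the displayed hexcodes 2-byte UTF-8 sequences or single-byte ANSI values?) can also be done mechanically outside FiveWin. The following is only a rough sketch in Python, not part of FWH: the function name classify_hex_dump is hypothetical, and it assumes the dump is space-separated like the StrToHex( cVar1, " " ) output in this thread.

```python
def classify_hex_dump(hex_dump: str) -> str:
    """Guess whether a space-separated hex dump holds UTF-8 or single-byte ANSI text.

    Heuristic only: a few ANSI byte sequences also happen to be valid UTF-8.
    """
    data = bytes.fromhex(hex_dump)  # whitespace between byte pairs is ignored
    if all(b < 0x80 for b in data):
        # Pure ASCII is byte-identical in UTF-8 and ANSI, so nothing can be concluded.
        return "ascii"
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes >= 0x80 that do not form valid UTF-8 sequences: single-byte (ANSI) text.
        return "ansi"


if __name__ == "__main__":
    for dump in ("C3 BC C3 A4 C3 B6", "DC C4 D6", "FC E4 F6"):
        print(dump, "->", classify_hex_dump(dump))
```

Pasting the hex string from the CHECK message box into this function gives a quick second opinion that does not depend on how any FiveWin control renders the text.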
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Wed Oct 11, 2023 10:33 PM
"Probably the encoding is already changed to ANSI before the TGet() was activated!?"
Yes.

We will discuss how you and other programmers would like the behavior to be.
Posts: 392
Joined: Tue Mar 10, 2009 11:54 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Thu Oct 12, 2023 07:30 AM
Please try WITH VARCHAR and PICTURE:
Code (fw):
#include "fivewin.ch"

function Main()

   local oDlg
   local oGet
   local cVar1 := "üäö"

   FW_SetUnicode( .T. )
   
   MsgInfo( "cVar1: " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   DEFINE DIALOG oDlg SIZE 300, 300 PIXEL TRUEPIXEL

   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg PICTURE "@!" VARCHAR 70 ;
      ON CHANGE oDlg:Update()

   @  60, 20 SAY cVar1 SIZE 250,30 PIXEL OF oDlg UPDATE

   @ 100, 20 SAY STRTOHEX( cVar1, " " ) SIZE 260,60 PIXEL OF oDlg UPDATE

   @ 200, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION MsgInfo( "oGet/TGet(): " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )

   ACTIVATE DIALOG oDlg CENTERED

   MsgInfo( "cVar1: " + Trim( cVar1 ) + CRLF + StrToHex( cVar1, " " ) )
RETURN NIL


cVar1 changes WITHOUT editing, and that cannot be right!

And then without VARCHAR and PICTURE, again without editing:
Code (fw):
   @  20, 20 GET oGet VAR cVar1 SIZE 250,30 PIXEL OF oDlg ;
   ON CHANGE oDlg:Update()

cVar1 doesn't change, that's OK!

Editing also works; the encoding is and remains UTF8!



---------------------------------
As a reminder: The correct UTF8 hexcodes for 'üäö' are C3 BC, C3 A4 and C3 B6, not DC C4 D6 (see, for example, https://www.charset.org/utf-8)!
DC C4 D6 are the ANSI hexcodes.
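
The byte values for 'üäö' can be double-checked independently of FiveWin. This is a minimal Python sketch; the helper name hex_bytes is hypothetical, and "ANSI" is assumed here to mean Windows code page 1252 (Western European), which the thread does not state explicitly.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as an upper-case, space-separated hex string."""
    return text.encode(encoding).hex(" ").upper()


if __name__ == "__main__":
    text = "üäö"
    print("UTF-8:", hex_bytes(text, "utf-8"))   # two bytes per umlaut
    print("ANSI :", hex_bytes(text, "cp1252"))  # one byte per umlaut (assumed cp1252)
```

In UTF-8 each of the three umlauts occupies two bytes with a C3 lead byte; in code page 1252 each is a single byte.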
Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: TGet() - UTF8 encoding fails [Unsolved]
Posted: Thu Oct 12, 2023 11:01 PM
"As a reminder: The correct UTF8 hexcodes for 'üäö' are C3 BC, C3 A4 and C3 B6, not DC C4 D6 (see, for example, https://www.charset.org/utf-8)!
DC C4 D6 are the ANSI hexcodes"
Code (fw):
+---+--------+-----------------+
|STR|ANSI-HEX|UTF8-HEX         |
|üäö|FC E4 F6|C3 BC C3 A4 C3 B6|
|ÜÄÖ|DC C4 D6|C3 9C C3 84 C3 96|
+---+--------+-----------------+
With the picture clause "@!", "üäö" is converted to "ÜÄÖ", and hence hex codes like "DC C4 D6" are correct for upper-case text in ANSI.
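
The reading of "DC C4 D6" as upper-case ANSI text can be verified by decoding those bytes directly. This is a rough Python sketch that again assumes "ANSI" means Windows code page 1252; the function name explain_bytes is hypothetical and nothing here is FWH code.

```python
def explain_bytes(hex_dump: str, ansi_codepage: str = "cp1252") -> dict:
    """Decode a hex dump as ANSI text and report what the same text would be in UTF-8."""
    raw = bytes.fromhex(hex_dump)
    text = raw.decode(ansi_codepage)  # interpret the bytes as single-byte ANSI
    return {
        "text": text,
        "is_upper_of_input": text == "üäö".upper(),  # what PICTURE "@!" would produce
        "utf8_hex": text.encode("utf-8").hex(" ").upper(),
    }


if __name__ == "__main__":
    info = explain_bytes("DC C4 D6")
    print(info["is_upper_of_input"])
    print(info["utf8_hex"])
```

Decoded as code page 1252, DC C4 D6 is the upper-cased form of the original string, and the same text encoded as UTF-8 matches the UTF8-HEX column of the second table row.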