FiveTech Support Forums

FiveWin / Harbour / xBase community

UTF-8, 2-Byte characters, Lower() and Upper()

Posted: Sat Jun 24, 2023 06:41 AM
The functions Lower() and Upper() don't work as expected for UTF-8 2-byte characters:
#include "FiveWin.ch"

function Main()

   local oDlg
   local oEdit
   local cVar1 := "lowerüöäßUPPER"
   local cVar2 := "UPPERÄÜÖßlower"

   REQUEST HB_CODEPAGE_UTF8
   HB_CDPSELECT( "UTF8" )
   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )

   ACTIVATE DIALOG oDlg CENTERED
RETURN NIL
Windows 11 Pro 22H2 22621.1848

Microsoft (R) Windows (R) Resource Compiler Version 10.0.10011.16384

Harbour 3.2.0dev (r2008190002)

FWH 23.10 x86

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Posted: Sat Jun 24, 2023 02:06 PM
// C:\FWH...\SAMPLES\FROSEUT8.PRG

#include "FiveWin.ch"

REQUEST HB_LANG_PT
REQUEST HB_CODEPAGE_PT850

// REQUEST HB_CODEPAGE_PTISO
// REQUEST HB_CODEPAGE_UTF8EX

FUNCTION Main()

   LOCAL oDlg
   LOCAL oEdit
   LOCAL cVar1 := "lowerüöäßUPPER"
   LOCAL cVar2 := "UPPERÄÜÖßlower"

   HB_LANGSELECT( 'PT' )     // Default language is now Portuguese
   HB_SETCODEPAGE( "PT850" )

   /*
   HB_CDPSELECT( "PTISO" )

   hb_cdpSelect( "UTF8EX" )
   */

   HB_CDPSELECT( "UTF8" )

   FW_SetUnicode( .T. )
   
   DEFINE DIALOG oDlg SIZE 600, 600 PIXEL TRUEPIXEL
   
   @  40, 20 EDIT oEdit VAR cVar1 SIZE 200,20 PIXEL OF oDlg
   
   @  60, 20 EDIT oEdit VAR cVar2 SIZE 200,20 PIXEL OF oDlg

   /*
   @  80, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ACTION MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )
   */

   @  90, 20 BUTTON "CHECK" SIZE 100,40 PIXEL OF oDlg ;
      ACTION( VIEW_UTF8( cVar1, cVar2 ) )

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL

FUNCTION VIEW_UTF8( ccVar1, ccVar2 )

/*
MsgInfo( ;
      Lower( "Lower( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" ) + CRLF + CRLF + ;
      Upper( "Upper( |" + cVar1 + "|" + CRLF + "|" + cVar2 + "| )" );
      )*/

   ? OemToAnsi( LOWER( "Lower( |" + ccVar1 + "|" + CRLF + "|" + ccVar2 + "| )" ) )

   ? OemToAnsi( UPPER( "Upper( |" + ccVar1 + "|" + CRLF + "|" + ccVar2 + "| )" ) )

   // ? hb_strtoutf8( LOWER( ccVar1 ) )


RETURN NIL
João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Posted: Sat Jun 24, 2023 02:07 PM

By default, Lower() and Upper() convert English (ASCII) letters only.

To convert other characters, we need to select the codepage of the desired language.
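
To illustrate the point, here is a minimal console sketch (plain Harbour, no FWH). It assumes the German Windows-1252 codepage "DEWIN" purely as an example and builds the test character with Chr() so the result does not depend on how the source file is saved. The first comparison is expected to fail and the second to succeed, but please verify on your own build.

```harbour
// Sketch only: Upper() follows the currently selected codepage.
// Assumption: "DEWIN" (German, Windows-1252) is chosen just for illustration.
REQUEST HB_CODEPAGE_DEWIN

PROCEDURE Main()

   LOCAL cSmallA := Chr( 228 )   // "ä" in Windows-1252
   LOCAL cBigA   := Chr( 196 )   // "Ä" in Windows-1252
   LOCAL cOldCdp

   // Default codepage: only the letters a-z / A-Z are case-converted
   ? "default codepage:", Upper( cSmallA ) == cBigA

   // Select the German codepage; hb_cdpSelect() returns the previous ID
   cOldCdp := hb_cdpSelect( "DEWIN" )
   ? "DEWIN codepage:  ", Upper( cSmallA ) == cBigA

   // Restore whatever was active before
   hb_cdpSelect( cOldCdp )

   RETURN
```

Note that this covers single-byte codepages only. A UTF-8 string would first have to be converted to the selected codepage, for example with hb_UTF8ToStr().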

Regards

G. N. Rao.
Hyderabad, India

Re: UTF-8, 2-Byte characters, Lower() and Upper()

Posted: Sun Jun 25, 2023 07:58 AM
karinha wrote:
...

karinha, thank you very much, that helps to clarify things.

nageswaragunupudi wrote:
By default Lower() and Upper() work with English characters only.
We need to set the codepage of the desired language

OK, understood.

So, in a multi-language environment, e.g.:
  - a dialog/browse that uses more than one language with diacritical marks
  - or a case-insensitive search where the source language of the search string is unknown
functions like U8Lower() and U8Upper() are essential!
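
One possible shape for such helpers is sketched below. The names U8Upper() and U8Lower() are taken from the sentence above; the bodies are an assumption, not an existing FWH or Harbour API. The sketch relies on Harbour's "UTF8EX" codepage, which to my knowledge makes Upper() and Lower() character-aware for UTF-8 strings. It has not been tested together with FW_SetUnicode( .T. ), so treat it as a starting point.

```harbour
// Sketch only: hypothetical U8Upper() / U8Lower() helpers.
// Assumption: the UTF8EX codepage provides Unicode-aware Upper()/Lower().
REQUEST HB_CODEPAGE_UTF8EX

FUNCTION U8Upper( cUtf8 )

   // Switch to UTF8EX for the conversion, remembering the previous codepage
   LOCAL cOldCdp := hb_cdpSelect( "UTF8EX" )
   LOCAL cResult := Upper( cUtf8 )

   hb_cdpSelect( cOldCdp )       // restore the caller's codepage

   RETURN cResult

FUNCTION U8Lower( cUtf8 )

   LOCAL cOldCdp := hb_cdpSelect( "UTF8EX" )
   LOCAL cResult := Lower( cUtf8 )

   hb_cdpSelect( cOldCdp )

   RETURN cResult

// Hypothetical case-insensitive "contains" test built on the helper above
FUNCTION U8Contains( cText, cSearch )
   RETURN U8Lower( cSearch ) $ U8Lower( cText )
```

A call such as U8Contains( cVar2, "äüö" ) would then be the case-insensitive search, provided the source file is saved as UTF-8. Characters without a one-to-one case mapping, such as "ß", will probably pass through unchanged.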