Sorry, I'm not familiar with Unicode. Just give me a sample showing the problem and I'll see what I can do.
EMG
From the beginning, we have been going in the wrong direction. I am sorry for this deviation.
This is not about conversion to and from UTF8 but about WideChar (16-bit character encoding).
Windows internally works with WideChar, i.e., 16-bit (little endian) characters.
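A minimal C sketch of that layout, assuming plain ASCII input only. The helper name ascii_to_wide_le is made up for illustration and is not part of FWH, Harbour or the Windows API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an FWH/Harbour function): writes each 8-bit
   character of src as one 16-bit little-endian unit, low byte first.
   Only meaningful for plain ASCII input.
   Returns the number of bytes written to dst. */
size_t ascii_to_wide_le( const char * src, unsigned char * dst, size_t dst_size )
{
   size_t n = 0;

   assert( src != NULL && dst != NULL );
   while( *src && n + 2 <= dst_size )
   {
      dst[ n++ ] = ( unsigned char ) *src++;  /* low byte: the ASCII value */
      dst[ n++ ] = 0x00;                      /* high byte: zero for ASCII */
   }
   return n;
}
```

Each character takes two bytes in the output buffer, with the zero byte second because of the little endian order.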
In 8bit notation
"AB" is "4142" in hex
In 16bit notation
"AB" is "41004200" in hex
So, what's the point?
EMG
BIN2W( cWideString ) should give the value equivalent to AscW()
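In C terms, what BIN2W() does here amounts to reading the first two bytes as one little endian 16-bit unit. This is only a sketch of that reading, not the actual Harbour source; the function name wide_le_first_unit is made up, and the handling of strings shorter than two bytes is my assumption:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the Harbour implementation of BIN2W):
   combines the first two bytes of p as a little-endian 16-bit value.
   Assumption: a missing second byte is treated as zero. */
unsigned int wide_le_first_unit( const unsigned char * p, size_t len )
{
   assert( p != NULL );
   if( len == 0 )
      return 0;
   if( len == 1 )
      return p[ 0 ];
   return ( unsigned int ) p[ 0 ] | ( ( unsigned int ) p[ 1 ] << 8 );
}
```

For a WideChar string whose first character is in the 16-bit range, the result is that character's code, which is why it matches AscW().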
#include "fivewin.ch"
function Main()
   local cUtf8, cWide, nAsc
   cUtf8 := "అ"
   ? cUtf8
   cWide := UTF8TOUTF16( cUtf8 )
   nAsc  := BIN2W( cWide )
   ? nAsc
   ? "Proof", HB_UTF8CHR( nAsc )
return nil
So, what about HB_UTF8ASC()? Do we need it?
EMG
Enrico Maria Giordano wrote: So, what about HB_UTF8ASC()? Do we need it?
EMG
Done. Please try xHarbour build 10253.
EMG
Thank you.
FWH is now using HB_UTF8CHR(). There is no problem with either Harbour or xHarbour.
So far there has been no need to use HB_UTF8ASC().
But if and when the need arises, what shall we do? There will be many users of older versions of xHarbour. It looks like we would need to force them to upgrade xHarbour.
But what about xharbour.com users? They will get unresolved external errors.
#pragma BEGINDUMP
#include "error.ch"
#include "hbapierr.h"
static BOOL utf8tou16nextchar( UCHAR ucChar, int * n, USHORT * uc )
{
   if( *n > 0 )
   {
      if( ( ucChar & 0xc0 ) != 0x80 )
         return FALSE;
      *uc = ( *uc << 6 ) | ( ucChar & 0x3f );
      ( *n )--;
      return TRUE;
   }
   *n = 0;
   *uc = ucChar;
   if( ucChar >= 0xc0 )
   {
      if( ucChar < 0xe0 )
      {
         *uc &= 0x1f;
         *n = 1;
      }
      else if( ucChar < 0xf0 )
      {
         *uc &= 0x0f;
         *n = 2;
      }
      else if( ucChar < 0xf8 )
      {
         *uc &= 0x07;
         *n = 3;
      }
      else if( ucChar < 0xfc )
      {
         *uc &= 0x03;
         *n = 4;
      }
      else if( ucChar < 0xfe )
      {
         *uc &= 0x01;
         *n = 5;
      }
   }
   return TRUE;
}
HB_FUNC( HB_UTF8ASC )
{
   const char * pszString = hb_parc( 1 );
   if( pszString )
   {
      HB_SIZE nLen = hb_parclen( 1 );
      USHORT wc = 0;
      int n = 0;
      while( nLen )
      {
         if( ! utf8tou16nextchar( ( unsigned char ) *pszString, &n, &wc ) )
            break;
         if( n == 0 )
            break;
         pszString++;
         nLen--;
      }
      hb_retnint( wc );
   }
   else
      hb_errRT_BASE_SubstR( EG_ARG, 3012, NULL, HB_ERR_FUNCNAME, HB_ERR_ARGS_BASEPARAMS );
}
#pragma ENDDUMP
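For anyone who wants to check the decoding logic outside of Harbour, here is a minimal standalone C sketch of the same idea. The function name utf8_first_code_point is made up for this post and is not a Harbour or xHarbour API. It differs from the code above in two ways: it returns an unsigned long instead of truncating to USHORT, and it does not handle the obsolete 5- and 6-byte lead bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone helper (not the Harbour API): decodes the
   first UTF-8 sequence in s (len bytes) and returns its code point.
   Returns 0 for empty input. On a truncated or malformed sequence it
   returns whatever bits were gathered before the problem was found. */
unsigned long utf8_first_code_point( const unsigned char * s, size_t len )
{
   unsigned long cp;
   int more;     /* continuation bytes still expected */
   size_t i;

   assert( s != NULL );
   if( len == 0 )
      return 0;

   cp = s[ 0 ];
   if( cp < 0xc0 )
      more = 0;                      /* ASCII or stray continuation byte */
   else if( cp < 0xe0 )
   {
      cp &= 0x1f;                    /* 110xxxxx: 2-byte sequence */
      more = 1;
   }
   else if( cp < 0xf0 )
   {
      cp &= 0x0f;                    /* 1110xxxx: 3-byte sequence */
      more = 2;
   }
   else if( cp < 0xf8 )
   {
      cp &= 0x07;                    /* 11110xxx: 4-byte sequence */
      more = 3;
   }
   else
      more = 0;                      /* 0xf8 and up: returned as is */

   for( i = 1; more > 0 && i < len; i++, more-- )
   {
      if( ( s[ i ] & 0xc0 ) != 0x80 )
         break;                      /* not a continuation byte */
      cp = ( cp << 6 ) | ( s[ i ] & 0x3f );
   }
   return cp;
}
```

With the Telugu letter from the earlier sample, whose UTF-8 bytes are E0 B0 85, this returns 0x0C05 (3077 decimal).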