FiveTech Support Forums

FiveWin / Harbour / xBase community
Posts: 92
Joined: Fri Nov 18, 2005 11:15 PM
Re: problema en Copy/Paste sobre un Get
Posted: Fri Mar 13, 2026 09:04 PM

Hello Antonio, it is interesting to see the AIs compete and propose a cleaner solution. We all win.

Regards

Ralph del Castillo

Lima PERU

Fwh 24.07, xHb123_10193, MySQL 8.x, BCC 7.3
Posts: 607
Joined: Mon Mar 04, 2013 04:32 PM
Re: problema en Copy/Paste sobre un Get
Posted: Wed Apr 08, 2026 07:01 AM

Hello everyone.

A much easier solution :D

Tested with beta 9 of FiveWin 2603, and the problem is fixed out of the box.
Copy and paste preserves the Ñ characters.

Regards.

Jose.

Fwh 24.07 64 bits + Harbour 64 bits 3.2dev(r2407221137) + MSVC64
Posts: 6983
Joined: Fri Oct 07, 2005 07:07 PM
Re: problema en Copy/Paste sobre un Get
Posted: Wed Apr 08, 2026 08:33 AM

After extensive testing, I want to clarify something important:

The suggested UTF-8 based fixes are technically correct — but they are not a practical solution for existing DBF systems.

DBF files are byte-based, not character-based.
UTF-8 uses variable-length encoding, which directly breaks core DBF assumptions.

In real-world systems where multiple legacy applications share the same DBF data:

  • Fields have fixed byte lengths
  • Indexes depend on exact byte sequences
  • Applications expect a specific ANSI codepage

Switching to UTF-8 causes:

  • Data truncation (multi-byte characters exceed field size)
  • Broken or inconsistent indexes
  • Incompatible behavior across existing applications

This is not a minor change — it is effectively a data model migration, not an encoding tweak.


---

Why this matters (simple example)

String: Müller

ANSI (Windows-1252):

4D FC 6C 6C 65 72   → 6 bytes

UTF-8:

4D C3 BC 6C 6C 65 72   → 7 bytes

Same text, different storage size.

Now imagine a DBF field defined as C(6):

  • ANSI → fits perfectly
  • UTF-8 → overflow / truncation

Indexes built on this field will no longer match.
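
For anyone who wants to check this on their own machine, here is a minimal console sketch, assuming Harbour 3.2 or later (hb_BLen(), hb_BLeft(), hb_HexToStr() and hb_StrToHex() are Harbour functions; FitField() is a hypothetical helper written only for this illustration). The strings are given as hex so the encoding of the source file does not affect the result.

```harbour
// Sketch only. FitField() is hypothetical and just mimics a fixed-width field.

PROCEDURE Main()

   LOCAL cAnsi := hb_HexToStr( "4DFC6C6C6572" )     // "Müller" in Windows-1252
   LOCAL cUtf8 := hb_HexToStr( "4DC3BC6C6C6572" )   // "Müller" in UTF-8

   ? "ANSI  bytes:", hb_BLen( cAnsi )   // 6
   ? "UTF-8 bytes:", hb_BLen( cUtf8 )   // 7

   // What a C(6) field can actually hold, shown as hex
   ? "C(6) ANSI  :", hb_StrToHex( FitField( cAnsi, 6 ), " " )
   ? "C(6) UTF-8 :", hb_StrToHex( FitField( cUtf8, 6 ), " " )

   RETURN

// Pads with spaces or cuts to nBytes, the way a character field
// of that width stores a value.
FUNCTION FitField( cValue, nBytes )
   RETURN hb_BLeft( cValue + Space( nBytes ), nBytes )
```

The UTF-8 value loses its final "r" in the six-byte field, so a seek for "Müller" on an index built from that field can no longer find it.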


---

Conclusion

If you are working with shared, legacy DBF systems:

  • UTF-8 is not a drop-in solution
  • You cannot safely switch encodings without restructuring data and all dependent applications

The only realistic approach in such environments is:

  • Keep a consistent ANSI codepage
  • Normalize all external input (UTF-8 → ANSI) before writing
  • Optionally fix mixed data on read
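
The normalization step could look roughly like the sketch below. It assumes Harbour 3.2 or later, that the shared DBFs use Windows-1252 (codepage "DEWIN" is only my example; use whatever codepage your data already has), and that hb_StrIsUTF8() is available in your build. ToAnsi() is a hypothetical helper name.

```harbour
// Sketch only: convert incoming UTF-8 to the application's ANSI codepage
// before it is written. ToAnsi() is a hypothetical helper.

REQUEST HB_CODEPAGE_DEWIN   // assumption: data is stored as Windows-1252

PROCEDURE Main()

   LOCAL cPasted := hb_HexToStr( "4DC3BC6C6C6572" )   // "Müller" as UTF-8

   hb_cdpSelect( "DEWIN" )

   // Expected bytes after conversion: 4D FC 6C 6C 65 72
   ? hb_StrToHex( ToAnsi( cPasted ), " " )

   RETURN

FUNCTION ToAnsi( cText )

   // Heuristic: only convert when the bytes form valid UTF-8.
   // Text that is already ANSI is returned unchanged.
   IF hb_StrIsUTF8( cText )
      RETURN hb_UTF8ToStr( cText )   // converts to the active codepage
   ENDIF

   RETURN cText
```

The same function can be applied to field values on read to repair records that already contain mixed data. The check is a heuristic: a short ANSI string can, in rare cases, also be valid UTF-8.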

Everything else will introduce subtle and hard-to-debug data corruption issues.

Additional note about LEFT() and UTF-8

Another important detail that is often overlooked:

In Harbour/xBase, functions like LEFT() are byte-based, not character-based.

This works fine with ANSI (single-byte encoding), but it breaks with UTF-8.

Example:

String: MÜLLERSTRAßE

ANSI (1 byte per character):

LEFT("MÜLLERSTRAßE", 5) → "MÜLLE"

UTF-8 (multi-byte characters):

"MÜLLERSTRAßE" = 4D C3 9C 4C 4C 45 ...
                  M  Ü      L  L  E

Now:

LEFT("MÜLLERSTRAßE", 2)

Result (byte-based cut):

4D C3  → "MÃ"

The UTF-8 character "Ü" (C3 9C) is cut in half, producing a corrupted string.
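
A small console sketch to reproduce the cut, assuming Harbour 3.2 or later with a single-byte codepage active (the default). The string is built from hex so the source file encoding plays no role.

```harbour
// Sketch only: shows Left() cutting a UTF-8 string in the middle of a character.

PROCEDURE Main()

   // "MÜLLERSTRAßE" as UTF-8 bytes
   LOCAL cUtf8 := hb_HexToStr( "4DC39C4C4C455253545241C39F45" )
   LOCAL cCut  := Left( cUtf8, 2 )

   ? "Bytes in string  :", hb_BLen( cUtf8 )            // 14 bytes for 12 characters
   ? "Left( cUtf8, 2 ) :", hb_StrToHex( cCut, " " )    // M plus only the first byte of Ü

   // A last byte of 0xC0 or above is a UTF-8 lead byte whose
   // continuation byte has been cut off.
   ? "Ends on lead byte:", hb_BCode( hb_BRight( cCut, 1 ) ) >= 0xC0

   RETURN
```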


---

Implication

This is another reason why UTF-8 is problematic in classic DBF environments:

  • String functions (LEFT, SUBSTR, etc.) operate on bytes
  • DBF fields are defined in bytes
  • Indexes depend on byte-exact values

UTF-8 breaks all of these assumptions.


---

Correct approach (if UTF-8 is used)

You must use UTF-8 aware functions:

HB_UTF8LEFT()
HB_UTF8SUBSTR()
HB_UTF8LEN()
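
A short sketch of these three functions on the same string, assuming Harbour 3.2 or later. Again the UTF-8 bytes are given in hex to keep the example independent of the source file encoding.

```harbour
// Sketch only: the UTF-8 aware functions count characters, not bytes.

PROCEDURE Main()

   // "MÜLLERSTRAßE" as UTF-8 bytes
   LOCAL cUtf8 := hb_HexToStr( "4DC39C4C4C455253545241C39F45" )

   ? "Characters :", hb_utf8Len( cUtf8 )                               // 12
   ? "First two  :", hb_StrToHex( hb_utf8Left( cUtf8, 2 ), " " )       // M and the complete Ü
   ? "Second char:", hb_StrToHex( hb_utf8SubStr( cUtf8, 2, 1 ), " " )  // Ü alone, two bytes

   RETURN
```

Note that the result of hb_utf8Left( cUtf8, 2 ) is three bytes long, so it still does not fit a field sized for two ANSI characters.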

---

Conclusion

UTF-8 is not just a different encoding — it changes how strings must be handled at every level.

This makes it incompatible with traditional DBF logic unless the entire system is adapted.

Posts: 8515
Joined: Tue Dec 20, 2005 07:36 PM
Re: problema en Copy/Paste sobre un Get
Posted: Wed Apr 08, 2026 02:11 PM

My dear Otto, for God's sake, we are on a programmers' forum: answer WITH PROGRAMMING CODE. Show it in practice, through code. That bible you wrote is useless for testing.

Thanks.

Regards.

João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
Posts: 6983
Joined: Fri Oct 07, 2005 07:07 PM
Re: problema en Copy/Paste sobre un Get
Posted: Wed Apr 08, 2026 09:58 PM

Dear João, I think it is important to understand how characters are stored in the DBF database. Best regards, Otto

Posts: 1054
Joined: Sun Oct 09, 2005 10:41 PM
Re: problema en Copy/Paste sobre un Get
Posted: Fri Apr 10, 2026 02:51 AM

My friend, try this, called right at the start of Main():

FUNCTION Main()

   SetGeneral()
   ...
   ..
   ..
   ..

RETURN NIL

///----------------------------------------------------

FUNCTION SetGeneral()

   HB_LANGSELECT( 'ES' )       // Select the Spanish language
   HB_SETCODEPAGE( 'ESWIN' )
   HB_CDPSELECT( "ESWIN" )
   HB_LangSelect( "ES" )
   HB_CODEPAGE_ESWIN()         // FW_SetUnicode( .t. )

RETURN NIL
Posts: 8515
Joined: Tue Dec 20, 2005 07:36 PM
Re: problema en Copy/Paste sobre un Get
Posted: Fri Apr 10, 2026 12:23 PM
// C:\FWH2603\SAMPLES\WILLI2.PRG

#include "fivewin.ch"

REQUEST HB_LANG_ES
REQUEST HB_CODEPAGE_ESWIN

#ifndef __XHARBOUR__     // Harbour only; xHarbour does not have these yet.
   REQUEST HB_CODEPAGE_UTF8
   REQUEST HB_CODEPAGE_UTF8EX
#endif

FUNCTION Main()

   LOCAL oDlg
   LOCAL oGet1, oGet2, oGet3, oGet4
   LOCAL cVar1, cVar2, cVar3, cVar4
   LOCAL lActive := .f.

   SetGeneral()

   FW_SetUnicode( .T. ) // DOES NOT WORK WITH FWH26.03

   cVar1 := SPACE(100)
   cVar2 := SPACE(100)
   cVar3 := SPACE(100)
   cVar4 := SPACE(100)

   DEFINE DIALOG oDlg TITLE "From Code" PIXEL SIZE 300, 300

   oDlg:lHelpIcon := .F.

   @ 10,10 get oGet1 var cVar1 bitmap "..\bitmaps\on.bmp" ;
      action( msginfo( "With Transparent" ) ) of oDlg pixel size 60,12

   oGet1:lBtnTransparent := .T.       // transparent button get oGet1

   @ 40,10 get oGet2 var cVar2 bitmap "..\bitmaps\on.bmp" ;
      action( msginfo( "Without Transparent" ) ) of oDlg pixel size 60,12

   @ 70,10 get oGet3 var cVar3 bitmap "..\bitmaps\chkyes.bmp" ;
      action( msginfo( "With Adjust-Transparent" ) ) of oDlg pixel size 120,12

   oGet3:lBtnTransparent := .T.       // transparent button on oGet3
   oGet3:lAdjustBtn      := .T.       // adjust the button width on oGet3
   oGet3:lDisColors      := .F.       // turn off the default disabled colors
   oGet3:nClrTextDis     := CLR_WHITE // text color when disabled
   oGet3:nClrPaneDis     := CLR_BLUE  // pane color when disabled

   @ 100,10 get oGet4 var cVar4 bitmap "..\bitmaps\chkyes.bmp" ;
      action( if( lActive,oGet3:disable(),oGet3:enable()), lActive:= !lActive, oDlg:update() ) ;
      of oDlg pixel size 120,12
   

   oGet4:lAdjustBtn      := .T.
   

   ACTIVATE DIALOG oDlg CENTERED

RETURN NIL

FUNCTION SetGeneral()

   HB_LANGSELECT( 'ES' )       // Select the Spanish language
   HB_SETCODEPAGE( 'ESWIN' )
   HB_CDPSELECT("ESWIN")
   HB_LangSelect( "ES" )
   HB_CODEPAGE_ESWIN()         // FW_SetUnicode( .t. )

RETURN NIL

// END

Regards.

João Santos - São Paulo - Brasil - Phone: +55(11)95150-7341
Posts: 44158
Joined: Thu Oct 06, 2005 05:47 PM
Re: problema en Copy/Paste sobre un Get
Posted: Sat Apr 11, 2026 05:25 AM
Regards,

Antonio Linares
www.fivetechsoft.com
Posts: 6983
Joined: Fri Oct 07, 2005 07:07 PM
Re: problema en Copy/Paste sobre un Get
Posted: Sat Apr 11, 2026 07:12 AM

The fix from Antonio solves internal issues in FiveWin, but it does not address problems coming from external sources.

The main cause is the use of different character encodings on various websites, especially when copying and pasting content. Many Eastern European websites still use legacy encodings like Windows-1250, Windows-1251, or ISO variants instead of proper UTF-8.

Because of our geographical location, we are much more frequently exposed to these mixed encodings in daily work. In practice, this means that even with Unicode enabled, the data we receive is often already inconsistently encoded before it reaches the application.

In these cases, there is unfortunately no universal automatic solution. The only reliable approach is manual handling (“handmade”), meaning detecting the source encoding and converting it explicitly when needed.
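
A rough sketch of what such explicit handling can look like, assuming Harbour 3.2 or later and that the source encoding is known from context (here Windows-1250 via codepage "PLWIN", chosen only as an example). ToUtf8() is a hypothetical helper, and hb_StrIsUTF8() is assumed to be available in your build.

```harbour
// Sketch only: the caller names the source codepage, nothing is guessed
// beyond the "is it already UTF-8?" check. ToUtf8() is hypothetical.

REQUEST HB_CODEPAGE_PLWIN   // assumption: the pasted text is Windows-1250

PROCEDURE Main()

   LOCAL cRaw := hb_HexToStr( "A3F3649F" )   // "Łódź" as Windows-1250 bytes

   // Expected UTF-8 bytes: C5 81 C3 B3 64 C5 BA
   ? hb_StrToHex( ToUtf8( cRaw, "PLWIN" ), " " )

   RETURN

FUNCTION ToUtf8( cRaw, cSourceCdp )

   // Leave text alone if it is already valid UTF-8
   IF hb_StrIsUTF8( cRaw )
      RETURN cRaw
   ENDIF

   RETURN hb_StrToUTF8( cRaw, cSourceCdp )
```

From UTF-8 the text can then be converted to the application's ANSI codepage with hb_UTF8ToStr(). Characters that do not exist in that codepage, such as "Ł" in Windows-1252, cannot be stored there, which is where the manual work comes in.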
