FiveTech Support Forums

FiveWin / Harbour / xBase community
Board index FiveWin para Harbour/xHarbour Distancia de Levenshtein
Posts: 1816
Joined: Wed Oct 26, 2005 02:49 PM
Distancia de Levenshtein
Posted: Mon Jul 17, 2023 09:40 PM
Hola buenas tardes para todos, en este momento tenemos la necesidad de comparar un listado de nombres de productos, con la cadena de un nombre de producto, el problema es que la cadena no viene exacta, la idea es que se pueda sugerir la mas cercana posible.

Estuve consultando con ChatGTP y me sugiri贸 que lo podemos hacer utilizando el algoritmo de Levenshtein, incluso nos dio un ejemplo para usarlo con PHP y Python. Le pregunte si lo pod铆amos hacer con xharbour/harbour y me dice que no hay una librer铆a que lo tenga, pero que podemos hacer el algoritmo en C++, ya que xharbour esta basado en clipper, que a su vez esta basado en c++.

Buscando en Google encontr茅 el mencionado algoritmo escrito en C++.
Code (fw): Select all Collapse
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

int levenshtein(const string &s1, const string &s2)
{
聽 聽int N1 = s1.size();
聽 聽int N2 = s2.size();
聽 聽int i, j;
聽 聽vector<int> T(N2+1);

聽 聽for ( i = 0; i <= N2; i++ )
聽 聽 聽 T[i] = i;

聽 聽for ( i = 0; i < N1; i++ ) 
聽 聽{
聽 聽 聽 T[0] = i+1;
聽 聽 聽 int corner = i;
聽 聽 聽 for ( j = 0; j < N2; j++ ) 
聽 聽 聽 {
聽 聽 聽 聽 聽int upper = T[j+1];
聽 聽 聽 聽 聽if ( s1[i] == s2[j] )
聽 聽 聽 聽 聽 聽 T[j+1] = corner;
聽 聽 聽 聽 聽else
聽 聽 聽 聽 聽 聽 T[j+1] = min(T[j], min(upper, corner)) + 1;
聽 聽 聽 聽 聽corner = upper;
聽 聽 聽 }
聽 聽}
聽 聽return T[N2];
}
https://es.wikipedia.org/wiki/Distancia_de_Levenshtein

De casualidad alguno de los master, nos puede ayudar a traducir ese algoritmo para xharbour/harbour?

ChatGTP nos sugiri贸 este c贸digo para hacerlo. Pero salen varios errores.
Code (fw): Select all Collapse
//Algoritmo de Levenshtein
FUNCTION LevenshteinDistance(s1, s2)
聽 聽 LOCAL m, n, matriz, i, j, coste

聽 聽 m := Len(s1)
聽 聽 n := Len(s2)
聽 聽 matriz := Array(m+1, n+1)

聽 聽 FOR i := 0 TO m
聽 聽 聽 聽 matriz[i][0] := i
聽 聽 NEXT

聽 聽 FOR j := 0 TO n
聽 聽 聽 聽 matriz[0][j] := j
聽 聽 NEXT

聽 聽 FOR i := 1 TO m
聽 聽 聽 聽 FOR j := 1 TO n
聽 聽 聽 聽 聽 聽 IF s1[i] <> s2[j]
聽 聽 聽 聽 聽 聽 聽 聽 coste := 1
聽 聽 聽 聽 聽 聽 ELSE
聽 聽 聽 聽 聽 聽 聽 聽 coste := 0
聽 聽 聽 聽 聽 聽 ENDIF
聽 聽 聽 聽 聽 聽 matriz[i][j] := Min(matriz[i-1][j] + 1, matriz[i][j-1] + 1, matriz[i-1][j-1] + coste)
聽 聽 聽 聽 NEXT j
聽 聽 NEXT i

聽RETURN matriz[m][n]
Error E0021 Incorrect number of arguments: MIN

De antemano gracias
Saludos
LEANDRO AREVALO
Bogot谩 (Colombia)
https://hymlyma.com
https://hymplus.com/
leandroalfonso111@gmail.com
leandroalfonso111@hotmail.com

[ Turbo Incremental Link64 6.98 Embarcadero 7.70 ] [ FiveWin 25.01 ] [ xHarbour 64 bits) ]
Posts: 234
Joined: Tue Sep 01, 2009 07:55 AM
Re: Distancia de Levenshtein
Posted: Tue Jul 18, 2023 05:33 AM
Buenos d铆as:
Prueba a hacer esta modificaci贸n:
Code (fw): Select all Collapse
matriz[i][j] := Min(matriz[i-1][j] + 1,Min( matriz[i][j-1] + 1, matriz[i-1][j-1] + coste))
Por otra parte, no estoy seguro pero veo que
Code (fw): Select all Collapse
 聽 FOR i := 0 TO m
聽 聽 聽 聽 matriz[i][0] := i
聽 聽 NEXT

聽 聽 FOR j := 0 TO n
聽 聽 聽 聽 matriz[0][j] := j
聽 聽 NEXT
Y si no recuerdo mal en Harbour el primer elemento de la matriz o array es el 1
Un saludo
Jos茅 Luis
Posts: 1816
Joined: Wed Oct 26, 2005 02:49 PM
Re: Distancia de Levenshtein
Posted: Tue Jul 18, 2023 10:47 PM
Jos茅 Luis, gracias por las sugerencias, fueron realmente 煤tiles. Dejo la funci贸n funcionando jejejeje, por si alguien le interesa.
Code (fw): Select all Collapse
//Algoritmo de Levenshtein
FUNCTION LevenshteinDistance(s1, s2)
聽 聽 LOCAL m, n, matriz, i, j, coste, jCn, iCn

聽 聽 m := Len(s1)
聽 聽 n := Len(s2)
聽 聽 matriz := Array(m+1, n+1)

聽 聽 iCn := 0
聽 聽 FOR i := 1 TO m+1
聽 聽 聽 聽 matriz[i][1] := iCn
聽 聽 聽 聽 iCn++
聽 聽 NEXT

聽 聽 jCn := 0
聽 聽 FOR j := 1 TO n+1
聽 聽 聽 聽 matriz[1][j] := jCn
聽 聽 聽 聽 jCn++
聽 聽 NEXT

聽 聽 FOR i := 2 TO m+1
聽 聽 聽 聽 FOR j := 2 TO n+1
聽 聽 聽 聽 聽 聽 IF substr(s1,i-1,1) != substr(s2,j-1,1)
聽 聽 聽 聽 聽 聽 聽 聽 coste := 1
聽 聽 聽 聽 聽 聽 ELSE
聽 聽 聽 聽 聽 聽 聽 聽 coste := 0
聽 聽 聽 聽 聽 聽 ENDIF
聽 聽 聽 聽 聽 聽 matriz[i][j] := Min(matriz[i-1][j] + 1,Min( matriz[i][j-1] + 1, matriz[i-1][j-1] + coste))
聽 聽 聽 聽 NEXT j
聽 聽 NEXT i
聽 聽 
RETURN matriz[m+1][n+1]
Saludos
LEANDRO AREVALO
Bogot谩 (Colombia)
https://hymlyma.com
https://hymplus.com/
leandroalfonso111@gmail.com
leandroalfonso111@hotmail.com

[ Turbo Incremental Link64 6.98 Embarcadero 7.70 ] [ FiveWin 25.01 ] [ xHarbour 64 bits) ]
Posts: 729
Joined: Tue Oct 18, 2005 06:49 PM
Re: Distancia de Levenshtein
Posted: Wed Jul 19, 2023 03:18 AM

Leandro,

la funcion StrDiff(), en xHarbour, te calcula la funcion Levenshtein.

Posts: 10733
Joined: Sun Nov 19, 2006 05:22 AM
Re: Distancia de Levenshtein
Posted: Wed Jul 19, 2023 06:44 AM
George wrote:Leandro,
la funcion StrDiff(), en xHarbour, te calcula la funcion Levenshtein.
Yes.
This is the documentation of the function

StrDiff()
Calculates the similarity of two strings.
Syntax
StrDiff( <cString1> , ;
<cString2> , ;
[<nReplace>], ;
[<nDelete>] , ;
[<nInsert>] ) --> nSimilarity

Arguments
<cString1> and <cString2>
Thes are two character strings being compared for their similarity.
<nReplace>
The weighing factor for Replace operations defaults to 3. It can be changed to a numeric value between 0 and 255.
<nDelete>
The weighing factor for Delete operations defaults to 6. It can be changed to a numeric value between 0 and 255. Delete operations are considered the most expensive ones.
<nInsert>
The weighing factor for Insert operations defaults to 1. It can be changed to a numeric value between 0 and 255. Return
The function returns a numeric value indicating the similarity of two character strings.
Description
The function calculates the Levenshtein distance which indicates the similarity of two character strings. The algorithm weighs Delete, Insert and Replace operations required to transform <cString1> into <cString2>. The weighing factors of each operation influence the result. It is assumed that two strings are more similar the smaller the result is.

https://en.wikipedia.org/wiki/Levenshtein_distance
Regards



G. N. Rao.

Hyderabad, India
Posts: 1816
Joined: Wed Oct 26, 2005 02:49 PM
Re: Distancia de Levenshtein
Posted: Wed Jul 19, 2023 06:33 PM
Excelente Mr.Nages

Muchas gracias por la informaci贸n :D
Saludos
LEANDRO AREVALO
Bogot谩 (Colombia)
https://hymlyma.com
https://hymplus.com/
leandroalfonso111@gmail.com
leandroalfonso111@hotmail.com

[ Turbo Incremental Link64 6.98 Embarcadero 7.70 ] [ FiveWin 25.01 ] [ xHarbour 64 bits) ]

Continue the discussion