FiveTech Support Forums

FiveWin / Harbour / xBase community
Posts: 44158
Joined: Thu Oct 06, 2005 05:47 PM
Understanding Transformers
Posted: Sat Jul 06, 2024 10:10 AM
Transformers are described in the paper "Attention Is All You Need" and are the architecture used by AI large language models (ChatGPT, etc.):
Code:
FUNCTION Main()

    LOCAL aEmbeddings, aWq, aWk, aWv, aBq, aBk, aBv
    LOCAL aQ, aK, aV
    LOCAL aAttentionScores, aOutput

    // Simulate embeddings and weight matrices (normally they would be loaded or learned)
    aEmbeddings := GenerateRandomMatrix(3, 4)  // [seq_len, d_model]
    aWq := GenerateRandomMatrix(4, 2)  // [d_model, d_k]
    aWk := GenerateRandomMatrix(4, 2)
    aWv := GenerateRandomMatrix(4, 2)
    aBq := GenerateRandomVector(2)  // [d_k]
    aBk := GenerateRandomVector(2)
    aBv := GenerateRandomVector(2)

    ? aEmbeddings

    // Apply the linear projections
    aQ := LinearTransformation(aEmbeddings, aWq, aBq)
    aK := LinearTransformation(aEmbeddings, aWk, aBk)
    aV := LinearTransformation(aEmbeddings, aWv, aBv)

    // Calculate the attention scores
    aAttentionScores := CalculateAttentionScores(aQ, aK)

    // Apply the attention scores to the values
    aOutput := ApplyAttention(aAttentionScores, aV)

    // Print the results
    ? "Query:", aQ
    ? "Key:", aK
    ? "Value:", aV
    ? "Attention Scores:", aAttentionScores
    ? "Output:", aOutput

RETURN NIL

FUNCTION LinearTransformation(aX, aW, aB)
    LOCAL aResult, i, j, k, nSum
    LOCAL nRows := Len(aX), nCols := Len(aW[1]), nInner := Len(aW)

    aResult := Array(nRows)
    FOR i := 1 TO nRows
        aResult[i] := Array(nCols)
        FOR j := 1 TO nCols
            nSum := 0
            FOR k := 1 TO nInner
                nSum += aX[i][k] * aW[k][j]
            NEXT
            aResult[i][j] := nSum + aB[j]
        NEXT
    NEXT

RETURN aResult

FUNCTION GenerateRandomMatrix(nRows, nCols)
    LOCAL aMatrix := Array(nRows, nCols), i, j
    FOR i := 1 TO nRows
        FOR j := 1 TO nCols
            aMatrix[i, j] := hb_Random(-1, 1)
        NEXT
    NEXT
RETURN aMatrix

FUNCTION GenerateRandomVector(nSize)
    LOCAL aVector := Array(nSize), i
    FOR i := 1 TO nSize
        aVector[i] := hb_Random(-1, 1)
    NEXT
RETURN aVector

FUNCTION CalculateAttentionScores(aQ, aK)
    LOCAL aScores, i, j, k, nSum, nExpSum
    LOCAL nRowsQ := Len(aQ), nColsQ := Len(aQ[1])
    LOCAL nRowsK := Len(aK), nColsK := Len(aK[1])

    // aQ and aK must have the same number of columns (d_k)
    IF nColsQ <> nColsK
        ? "Error: aQ and aK dimensions do not match"
        RETURN NIL
    ENDIF

    aScores := Array(nRowsQ, nRowsK)
    FOR i := 1 TO nRowsQ
        FOR j := 1 TO nRowsK
            nSum := 0
            FOR k := 1 TO nColsQ
                nSum += aQ[i][k] * aK[j][k]
            NEXT
            aScores[i][j] := nSum / Sqrt(nColsQ)  // scale the attention scores by Sqrt(d_k)
        NEXT
    NEXT

    // Apply softmax normalization to each row
    FOR i := 1 TO nRowsQ
        nExpSum := 0
        FOR j := 1 TO nRowsK
            aScores[i][j] := Exp(aScores[i][j])
            nExpSum += aScores[i][j]
        NEXT
        FOR j := 1 TO nRowsK
            aScores[i][j] /= nExpSum
        NEXT
    NEXT

RETURN aScores

FUNCTION ApplyAttention(aScores, aV)
    LOCAL aOutput, i, j, k, nSum
    LOCAL nRows := Len(aScores), nCols := Len(aV[1]), nInner := Len(aV)

    aOutput := Array(nRows, nCols)
    FOR i := 1 TO nRows
        FOR j := 1 TO nCols
            nSum := 0
            FOR k := 1 TO nInner
                nSum += aScores[i][k] * aV[k][j]
            NEXT
            aOutput[i][j] := nSum
        NEXT
    NEXT

RETURN aOutput
{{-0.20, -0.33, -0.13, 0.75}, {0.56, 0.31, 0.19, -0.09}, {-0.26, 0.48, 0.73, -0.32}}
Query: {{0.6859, -0.0584}, {1.3492, 0.9291}, {1.0082, 1.1412}}
Key: {{0.3594, 1.1780}, {1.0069, 1.3886}, {0.8579, 0.6985}}
Value: {{-0.2781, -0.6665}, {-1.0988, 0.3276}, {-0.3004, 0.3100}}
Attention Scores: {{0.27, 0.37, 0.36}, {0.23, 0.49, 0.27}, {0.26, 0.49, 0.25}}
Output: {{-0.590643, 0.049439}, {-0.690302, 0.091827}, {-0.684619, 0.064918}}
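To cross-check the attention math, here is an equivalent sketch in plain Python (the function name and layout are my own, not part of the Harbour sample), fed with the Query/Key/Value matrices from the run above:

```python
import math

def scaled_dot_product_attention(Q, K, V):
    """Mirror of the Harbour code: scores, row-wise softmax, weighted sum."""
    d_k = len(Q[0])
    # scores[i][j] = dot(Q[i], K[j]) / sqrt(d_k) -- as in CalculateAttentionScores()
    scores = [[sum(q * k for q, k in zip(qrow, krow)) / math.sqrt(d_k)
               for krow in K] for qrow in Q]
    # row-wise softmax
    weights = []
    for row in scores:
        exps = [math.exp(s) for s in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    # output[i][j] = sum_l weights[i][l] * V[l][j] -- as in ApplyAttention()
    output = [[sum(w * V[l][j] for l, w in enumerate(wrow))
               for j in range(len(V[0]))] for wrow in weights]
    return weights, output

# Query/Key/Value rows taken from the sample run above
Q = [[0.6859, -0.0584], [1.3492, 0.9291], [1.0082, 1.1412]]
K = [[0.3594, 1.1780], [1.0069, 1.3886], [0.8579, 0.6985]]
V = [[-0.2781, -0.6665], [-1.0988, 0.3276], [-0.3004, 0.3100]]

weights, output = scaled_dot_product_attention(Q, K, V)
print([round(w, 2) for w in weights[0]])  # → [0.27, 0.37, 0.36]
```

The first row of attention weights reproduces the 0.27 / 0.37 / 0.36 scores printed above, and each row sums to 1, as a softmax should.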
regards, saludos

Antonio Linares
www.fivetechsoft.com
Re: Understanding Transformers
Posted: Sat Jul 06, 2024 10:51 AM
Implementing multi-head attention support:
Code:
FUNCTION Main()

    LOCAL aInput, aPositionalEncoding, aEncoderOutput
    LOCAL nBatchSize := 2, nSeqLen := 5, nModelDim := 8, nHeads := 2

    // Generate sample input
    aInput := GenerateRandomMatrix(nBatchSize, nSeqLen, nModelDim)

    // Generate the positional encoding
    aPositionalEncoding := GeneratePositionalEncoding(nSeqLen, nModelDim)

    // Add the positional encoding to the input
    aInput := AddPositionalEncoding(aInput, aPositionalEncoding)

    // Create and apply the encoder
    aEncoderOutput := TransformerEncoder(aInput, nHeads, 2) // 2 encoder layers

    ? "Input with positional encoding:"
    PrintMatrix(aInput)
    ? "Encoder output:"
    PrintMatrix(aEncoderOutput)

RETURN NIL

FUNCTION TransformerEncoder(aInput, nHeads, nLayers)
    LOCAL aOutput := aInput
    LOCAL i

    FOR i := 1 TO nLayers
        // Multi-Head Attention
        aOutput := AddAndNorm(aOutput, MultiHeadAttention(aOutput, nHeads))

        // Feed Forward
        aOutput := AddAndNorm(aOutput, FeedForward(aOutput))
    NEXT

RETURN aOutput

FUNCTION MultiHeadAttention(aInput, nHeads)
    LOCAL aOutputs := {}, aFinalOutput, i
    LOCAL nBatchSize := Len(aInput), nSeqLen := Len(aInput[1]), nModelDim := Len(aInput[1, 1])
    LOCAL nHeadDim := Int(nModelDim / nHeads)
    LOCAL aWq, aWk, aWv, aQ, aK, aV, aAttentionScores, aHeadOutput, aWo

    FOR i := 1 TO nHeads
        aWq := GenerateRandomMatrix(nModelDim, nHeadDim)
        aWk := GenerateRandomMatrix(nModelDim, nHeadDim)
        aWv := GenerateRandomMatrix(nModelDim, nHeadDim)

        aQ := LinearTransformation(aInput, aWq)
        aK := LinearTransformation(aInput, aWk)
        aV := LinearTransformation(aInput, aWv)

        aAttentionScores := CalculateAttentionScores(aQ, aK)
        aHeadOutput := ApplyAttention(aAttentionScores, aV)

        AAdd(aOutputs, aHeadOutput)
    NEXT

    aFinalOutput := ConcatenateOutputs(aOutputs)

    aWo := GenerateRandomMatrix(nModelDim, nModelDim)
    aFinalOutput := LinearTransformation(aFinalOutput, aWo)

RETURN aFinalOutput

FUNCTION FeedForward(aInput)
    LOCAL nBatchSize := Len(aInput), nSeqLen := Len(aInput[1]), nModelDim := Len(aInput[1, 1])
    LOCAL nFfDim := nModelDim * 4 // typically the inner dimension is 4x the model dimension

    LOCAL aW1 := GenerateRandomMatrix(nModelDim, nFfDim)
    LOCAL aW2 := GenerateRandomMatrix(nFfDim, nModelDim)

    LOCAL aHidden := LinearTransformation(aInput, aW1), aOutput
    aHidden := ApplyReLU(aHidden)
    aOutput := LinearTransformation(aHidden, aW2)

RETURN aOutput

FUNCTION AddAndNorm(aInput, aResidual)
    LOCAL aSum := AddMatrices(aInput, aResidual)
    LOCAL aNormalized := LayerNorm(aSum)
RETURN aNormalized

FUNCTION LayerNorm(aInput)
    LOCAL nBatchSize := Len(aInput), nSeqLen := Len(aInput[1]), nModelDim := Len(aInput[1, 1])
    LOCAL aOutput := Array(nBatchSize, nSeqLen, nModelDim)
    LOCAL i, j, k, nMean, nVariance, nEpsilon := 0.00001

    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            nMean := CalcMean(aInput[i, j])
            nVariance := CalcVariance(aInput[i, j], nMean)

            FOR k := 1 TO nModelDim
                aOutput[i, j, k] := (aInput[i, j, k] - nMean) / Sqrt(nVariance + nEpsilon)
            NEXT
        NEXT
    NEXT

RETURN aOutput

FUNCTION GeneratePositionalEncoding(nSeqLen, nModelDim)
    LOCAL aEncoding := Array(nSeqLen, nModelDim)
    LOCAL i, j, nPos, nI

    FOR i := 1 TO nSeqLen
        FOR j := 1 TO nModelDim
            nPos := i - 1
            nI := j - 1
            IF nI % 2 == 0
                aEncoding[i, j] := Sin(nPos / (10000 ** (nI / nModelDim)))
            ELSE
                aEncoding[i, j] := Cos(nPos / (10000 ** ((nI - 1) / nModelDim)))
            ENDIF
        NEXT
    NEXT

RETURN aEncoding

FUNCTION AddPositionalEncoding(aInput, aPositionalEncoding)
    LOCAL nBatchSize := Len(aInput), nSeqLen := Len(aInput[1]), nModelDim := Len(aInput[1, 1])
    LOCAL aOutput := Array(nBatchSize, nSeqLen, nModelDim)
    LOCAL i, j, k

    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            FOR k := 1 TO nModelDim
                aOutput[i, j, k] := aInput[i, j, k] + aPositionalEncoding[j, k]
            NEXT
        NEXT
    NEXT

RETURN aOutput

FUNCTION LinearTransformation(aX, aW)
    LOCAL aResult, i, j, k, l, nSum
    LOCAL nBatchSize := Len(aX), nSeqLen := Len(aX[1])
    LOCAL nInDim := Len(aX[1, 1]), nOutDim := Len(aW[1])

    aResult := Array(nBatchSize, nSeqLen, nOutDim)
    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            FOR k := 1 TO nOutDim
                nSum := 0
                FOR l := 1 TO nInDim
                    nSum += aX[i, j, l] * aW[l, k]
                NEXT
                aResult[i, j, k] := nSum
            NEXT
        NEXT
    NEXT

RETURN aResult

FUNCTION CalculateAttentionScores(aQ, aK)
    LOCAL aScores, i, j, k, l, nSum
    LOCAL nBatchSize := Len(aQ), nSeqLen := Len(aQ[1]), nDimK := Len(aQ[1, 1])

    aScores := Array(nBatchSize, nSeqLen, nSeqLen)
    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            FOR k := 1 TO nSeqLen
                nSum := 0
                FOR l := 1 TO nDimK
                    nSum += aQ[i, j, l] * aK[i, k, l]
                NEXT
                aScores[i, j, k] := nSum / Sqrt(nDimK)
            NEXT
        NEXT
    NEXT

    aScores := ApplySoftmax(aScores)

RETURN aScores

FUNCTION ApplyAttention(aScores, aV)
    LOCAL aOutput, i, j, k, l, nSum
    LOCAL nBatchSize := Len(aScores), nSeqLen := Len(aScores[1]), nDimV := Len(aV[1, 1])

    aOutput := Array(nBatchSize, nSeqLen, nDimV)
    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            FOR k := 1 TO nDimV
                nSum := 0
                FOR l := 1 TO nSeqLen
                    nSum += aScores[i, j, l] * aV[i, l, k]
                NEXT
                aOutput[i, j, k] := nSum
            NEXT
        NEXT
    NEXT

RETURN aOutput

FUNCTION ConcatenateOutputs(aOutputs)
    LOCAL nBatchSize := Len(aOutputs[1]), nSeqLen := Len(aOutputs[1, 1])
    LOCAL nTotalDim := 0, nHeadDim, nHeads := Len(aOutputs)
    LOCAL aResult, i, j, k, l, nIndex

    nHeadDim := Len(aOutputs[1, 1, 1])
    nTotalDim := nHeadDim * nHeads

    aResult := Array(nBatchSize, nSeqLen, nTotalDim)
    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            nIndex := 1
            FOR k := 1 TO nHeads
                FOR l := 1 TO nHeadDim
                    aResult[i, j, nIndex] := aOutputs[k, i, j, l]
                    nIndex++
                NEXT
            NEXT
        NEXT
    NEXT

RETURN aResult

FUNCTION ApplyReLU(aInput)
    LOCAL aOutput := AClone(aInput)
    LOCAL i, j, k

    FOR i := 1 TO Len(aOutput)
        FOR j := 1 TO Len(aOutput[i])
            FOR k := 1 TO Len(aOutput[i, j])
                aOutput[i, j, k] := Max(0, aOutput[i, j, k])
            NEXT
        NEXT
    NEXT

RETURN aOutput

FUNCTION ApplySoftmax(aInput)
    LOCAL aOutput := AClone(aInput)
    LOCAL i, j, k, nMax, nSum, nBatchSize := Len(aInput), nSeqLen := Len(aInput[1])

    FOR i := 1 TO nBatchSize
        FOR j := 1 TO nSeqLen
            nMax := MaxInArray(aOutput[i, j])
            nSum := 0
            FOR k := 1 TO nSeqLen
                aOutput[i, j, k] := Exp(aOutput[i, j, k] - nMax)  // subtract the row max for numerical stability
                nSum += aOutput[i, j, k]
            NEXT
            FOR k := 1 TO nSeqLen
                aOutput[i, j, k] /= nSum
            NEXT
        NEXT
    NEXT

RETURN aOutput

FUNCTION AddMatrices(aA, aB)
    LOCAL aResult := AClone(aA)
    LOCAL i, j, k

    FOR i := 1 TO Len(aA)
        FOR j := 1 TO Len(aA[i])
            FOR k := 1 TO Len(aA[i, j])
                aResult[i, j, k] += aB[i, j, k]
            NEXT
        NEXT
    NEXT

RETURN aResult

FUNCTION GenerateRandomMatrix(nDim1, nDim2, nDim3)
    LOCAL aMatrix, i, j, k

    IF nDim3 == NIL
        aMatrix := Array(nDim1, nDim2)
        FOR i := 1 TO nDim1
            FOR j := 1 TO nDim2
                aMatrix[i, j] := hb_Random(0, 0.02)
            NEXT
        NEXT
    ELSE
        aMatrix := Array(nDim1, nDim2, nDim3)
        FOR i := 1 TO nDim1
            FOR j := 1 TO nDim2
                FOR k := 1 TO nDim3
                    aMatrix[i, j, k] := hb_Random(0, 0.02)
                NEXT
            NEXT
        NEXT
    ENDIF

RETURN aMatrix

FUNCTION CalcMean(aArray)
    LOCAL nSum := 0, i

    FOR i := 1 TO Len(aArray)
        nSum += aArray[i]
    NEXT

RETURN nSum / Len(aArray)

FUNCTION CalcVariance(aArray, nMean)
    LOCAL nSum := 0, i

    FOR i := 1 TO Len(aArray)
        nSum += (aArray[i] - nMean) ** 2
    NEXT

RETURN nSum / Len(aArray)

FUNCTION MaxInArray(aArray)
    LOCAL nMax := aArray[1], i

    FOR i := 2 TO Len(aArray)
        IF aArray[i] > nMax
            nMax := aArray[i]
        ENDIF
    NEXT

RETURN nMax

FUNCTION PrintMatrix(aMatrix)
    LOCAL i, j, k

    FOR i := 1 TO Len(aMatrix)
        ? "Batch", i
        FOR j := 1 TO Len(aMatrix[i])
            ?? "  Seq", j, ":"
            FOR k := 1 TO Len(aMatrix[i, j])
                ?? Round(aMatrix[i, j, k], 4), " "
            NEXT
            ?
        NEXT
        ?
    NEXT

RETURN NIL
Input with positional encoding:
Batch 1 Seq 1 : 0.0014 1.0197 0.0004 1.0142 0.0178 1.0140 0.0108 1.0132
Seq 2 : 0.8541 0.5441 0.1168 1.0027 0.0280 1.0131 0.0070 1.0059
Seq 3 : 0.9097 -0.4140 0.2128 0.9911 0.0317 1.0035 0.0173 1.0191
Seq 4 : 0.1433 -0.9840 0.3029 0.9676 0.0489 1.0189 0.0213 1.0060
Seq 5 : -0.7532 -0.6397 0.3924 0.9272 0.0599 1.0040 0.0056 1.0126


Batch 2 Seq 1 : 0.0183 1.0073 0.0185 1.0091 0.0085 1.0046 0.0082 1.0038
Seq 2 : 0.8497 0.5552 0.1064 0.9975 0.0107 1.0165 0.0106 1.0086
Seq 3 : 0.9163 -0.4070 0.1992 0.9964 0.0240 1.0068 0.0071 1.0143
Seq 4 : 0.1552 -0.9761 0.3045 0.9570 0.0360 1.0150 0.0064 1.0168
Seq 5 : -0.7411 -0.6356 0.4068 0.9296 0.0404 1.0129 0.0205 1.0003


Encoder output:
Batch 1 Seq 1 : -1.0144 1.0087 -1.0139 0.9987 -0.9794 0.9981 -0.9920 0.9942
Seq 2 : 0.6568 -0.0642 -1.0607 1.0069 -1.2676 1.0320 -1.3151 1.0120
Seq 3 : 0.8163 -1.6535 -0.4832 0.9712 -0.8206 0.9948 -0.8464 1.0215
Seq 4 : -0.2712 -2.0317 -0.0200 1.0203 -0.4167 1.1004 -0.4590 1.0780
Seq 5 : -1.5150 -1.3417 0.2130 1.0197 -0.2879 1.1349 -0.3686 1.1456


Batch 2 Seq 1 : -0.9924 1.0021 -0.9892 1.0067 -1.0095 0.9976 -1.0088 0.9934
Seq 2 : 0.6464 -0.0333 -1.0720 0.9920 -1.2932 1.0368 -1.2921 1.0154
Seq 3 : 0.8283 -1.6297 -0.5029 0.9799 -0.8282 0.9997 -0.8583 1.0112
Seq 4 : -0.2511 -2.0214 -0.0156 1.0076 -0.4357 1.0983 -0.4811 1.0991
Seq 5 : -1.5075 -1.3457 0.2309 1.0229 -0.3233 1.1482 -0.3523 1.1269
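One detail worth noting: unlike the softmax in the first post, ApplySoftmax() here subtracts the row maximum before calling Exp(). Mathematically this is a no-op (the shift cancels in the ratio), but it prevents overflow for large scores. A small Python sketch of the same trick (names are my own, for illustration):

```python
import math

def softmax(row):
    # Subtracting the max is mathematically a no-op (it cancels in the
    # ratio) but keeps exp() from overflowing on large inputs.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

# Works even where naive exp(1000) would overflow a double
print(softmax([1000.0, 1001.0, 1002.0]))
```

Shifting every input by the same constant leaves the result unchanged, which is why the Harbour version can safely subtract MaxInArray() per row.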
regards, saludos

Antonio Linares
www.fivetechsoft.com
Re: Understanding Transformers
Posted: Sun Jul 07, 2024 06:46 AM
regards, saludos

Antonio Linares
www.fivetechsoft.com
