Blog Entry © Thursday, February 5, 2026, by James Pate Williams, Jr., Three Toom-Cook 3-Way Multiplication Implementations

I translated some of Marco Bodrato ‘s pseudocode found on his website:

Optimal Toom-Cook Polynomial Multiplication, by Marco Bodrato

To Microsoft Win32 C++. I used Microsoft’s Visual Studio 2022 Community Version.

Balanced Toom-Cook 3-Way Multiplication

U = 7

V = 8

W = 56

U = 9

V = 9

W = 81

U = 18

V = 17

W = 306

U = 123

V = 456

W = 56088

U = 1234

V = 5678

W = 7006652

U = 12345

V = 67890

W = 838102050

U = 123456

V = 789012

W = 97408265472

D:\ToomCookBodrato\x64\Debug\ToomCookBodrato.exe (process 30392) exited with code 0 (0x0).

Unbalanced Toom-Cook 3-Way Multiplication

U = 20

V = 5

W = 100

U = 112

V = 20

W = 2240

U = 1234

V = 567

W = 699678

U = 12345

V = 6789

W = 83810205

U = 123456

V = 78901

W = 9740801856

U = 1234567

V = 890123

W = 1098916481741

U = 12345678

V = 9012345

W = 111263509394910

D:\ToomCookBodrato\x64\Debug\ToomCookBodrato.exe (process 13108) exited with code 0 (0x0).

Asymmetrical Squaring, Splitting in Five

U = 8

W = 64

U = 12

W = 144

U = 321

W = 103041

U = 1234

W = 1522756

U = 54321

W = 2950771041

U = 123456

W = 15241383936

D:\ToomCookBodrato\x64\Debug\ToomCookBodrato.exe (process 27592) exited with code 0 (0x0).

Press any key to close this window . . .

The accompanying source code is released under the GNU General Public License, version 3. The full license text is included in the downloadable archive and is also available at:
https://www.gnu.org/licenses/gpl-3.0.html

/*
   Toom–Cook Multiplication (Bodrato-inspired implementation)
   Copyright (C) 2026 James Pate Williams, Jr.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <https://www.gnu.org/licenses/>.
*/

#pragma once

// Algorithm structure inspired by Marco Bodrato’s Toom–Cook notes.
// See: http://www.bodrato.it/toom-cook/
// All implementation code and comments in this file are by
// James Pate Williams, Jr., BA, BS, Master of Software Engineering,
// Doctor of Philosophy in Computer Science

class Multiplication
{

public:

	static long long BalancedToomCook3Way(
		long long b, long long U, long long V);
	static long long UnbalancedToomCook3Way(
		long long b, long long U, long long V);
	static long long AsymeticalSquaring(
		long long b, long long U);
};

/*
   Toom–Cook Multiplication (Bodrato-inspired implementation)
   Copyright (C) 2026 James Pate Williams, Jr.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <https://www.gnu.org/licenses/>.
*/

#include "pch.h"
#include "Multiplication.h"

static void BaseBDigits(
	long long B, long long u,
	std::vector<long long>& digits)
{
	while (u > 0)
	{
		long long d = u % B;
		u = u / B;
		digits.push_back(d);
	}
}

long long Multiplication::BalancedToomCook3Way(
	long long b, long long U, long long V)
{
	double k = 3.0;
	double logU = log(U) / log(b);
	double logV = log(V) / log(b);
	double floorU = floor(floor(logU) / k);
	double floorV = floor(floor(logV) / k);
	double i = fmax(floorU, floorV) + 1.0;
	long long B = static_cast<long long>(pow(b, i));
	long long W = 0;
	std::vector<long long> UDigits;
	std::vector<long long> VDigits;

	for (long long j = 0; j < 5; j++)
	{
		long long x = 0, y = 0;

		if (j == 0)
			x = 1;
		else if (j == 1)
			x = -2;
		else if (j == 2)
			x = 1;
		else if (j == 3)
			x = -1;
		else
			x = 1;

		if (j == 0)
			y = 1;
		else if (j == 1)
			y = 1;
		else if (j == 2)
			y = 1;
		else if (j == 3)
			y = 1;
		else
			y = 0;

		UDigits.clear();
		VDigits.clear();

		BaseBDigits(B, U, UDigits);
		BaseBDigits(B, V, VDigits);

		long long U0 = 0, U1 = 0, U2 = 0;
		long long V0 = 0, V1 = 0, V2 = 0;

		if (UDigits.size() == 3)
		{
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (VDigits.size() == 3)
		{
			V2 = VDigits[2];
			V1 = VDigits[1];
			V0 = VDigits[0];
		}

		if (UDigits.size() == 2)
		{
			U2 = 0;
			U1 = UDigits[1];
			U2 = UDigits[0];
		}

		if (VDigits.size() == 2)
		{
			V2 = 0;
			V1 = VDigits[1];
			V0 = VDigits[0];
		}

		if (UDigits.size() == 1)
		{
			U2 = 0;
			U1 = 0;
			U0 = UDigits[0];
		}

		if (VDigits.size() == 1)
		{
			V2 = 0;
			V1 = 0;
			V0 = VDigits[0];
		}

		if (UDigits.size() == 0 ||
			VDigits.size() == 0)
			break;

		long long x2 = x * x;
		long long y2 = y * y;
		long long xy = x * y;
		
		U = U2 * x2 + U1 * xy + U0 * y2;
		V = V2 * x2 + V1 * xy + V0 * y2;
		
		if (U == 0 || V == 0)
			break;

		long long W3 = U0 + U2;
		long long W2 = V0 + V2;
		long long W0 = W3 - U1;
		long long W4 = W2 - V1;
		
		W3 = W3 + U1;
		W2 = W2 + V1;

		long long W1 = W3 * W2;

		W2 = W0 * W4;
		W0 = ((W0 + U2) << 1) - U0;
		W4 = ((W4 + V2) << 1) - V0;
		W3 = W0 * W4;
		W0 = U0 * V0;
		W4 = U2 * V2;

		W3 = (W3 - W1) / k;
		W1 = (W1 - W2) >> 1;
		W2 = W2 - W0;
		W3 = ((W2 - W3) >> 1) + (W4 << 1);
		W2 = W2 + W1;

		W3 = W4 * x + W3 * y;
		W1 = W2 * x + W1 * y;
		W1 = W1 - W3;

		W = W3 * x * x2 + W1 * x * y2 + W0 * y2 * y2;
	}

	return W;
}

long long Multiplication::UnbalancedToomCook3Way(
	long long b, long long U, long long V)
{
	double k = 3.0;
	double logU = log(U) / log(b);
	double logV = log(V) / log(b);
	double floorU = floor(floor(logU) / k);
	double floorV = floor(floor(logV) / k);
	double i = fmax(floorU, floorV) + 1.0;
	long long B = static_cast<long long>(pow(b, i));
	long long W = 0;
	std::vector<long long> UDigits;
	std::vector<long long> VDigits;

	for (long long j = 0; j < 5; j++)
	{
		long long x = 0, y = 0;

		if (j == 0)
			x = 1;
		else if (j == 1)
			x = -2;
		else if (j == 2)
			x = 1;
		else if (j == 3)
			x = -1;
		else
			x = 1;

		if (j == 0)
			y = 1;
		else if (j == 1)
			y = 1;
		else if (j == 2)
			y = 1;
		else if (j == 3)
			y = 1;
		else
			y = 0;

		long long U0 = 0, U1 = 0, U2 = 0;
		long long V0 = 0, V1 = 0, V2 = 0;
		long long U3 = 0;
		long long x2 = x * x, y2 = y * y;
		long long x3 = x * x2, y3 = y * y2;

		UDigits.clear();
		VDigits.clear();

		BaseBDigits(B, U, UDigits);
		BaseBDigits(B, V, VDigits);

		if (UDigits.size() == 4)
		{
			U3 = UDigits[3];
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (UDigits.size() == 3)
		{
			U3 = 0;
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (VDigits.size() == 3)
		{
			V2 = VDigits[2];
			V1 = VDigits[1];
			V0 = VDigits[0];
		}

		if (UDigits.size() == 2)
		{
			U3 = 0;
			U2 = 0;
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (VDigits.size() == 2)
		{
			V2 = 0;
			V1 = VDigits[1];
			V0 = VDigits[0];
		}

		if (UDigits.size() == 1)
		{
			U3 = 0;
			U2 = 0;
			U1 = 0;
			U0 = UDigits[0];
		}

		if (VDigits.size() == 1)
		{
			V2 = 0;
			V1 = 0;
			V0 = VDigits[0];
		}

		if (UDigits.size() == 0 ||
			VDigits.size() == 0)
			break;

		U = U3 * x3 + U2 * x2 * y + U1 * x * y2 + U0 * y3;
		V = V1 * x + V0 * y;

		if (U == 0 || V == 0)
			break;

		long long W3 = U0 + U2;
		long long W2 = V0 + V1;
		long long W1 = U1 + U3;
		long long W4 = V0 - V1;
		long long W0 = W3 - W1;
		
		W3 = W3 + W1;
		W1 = W3 * W2;
		W2 = W0 * W4;
		W0 = U0 - (((U1 - ((U2 - U3) << 1)) << 1) << 1);
		W4 = W4 - V1;
		W3 = W0 * W4;
		W0 = U0 * V0;
		W4 = U3 * V1;

		W3 = (W3 - W1) / k;
		W1 = (W1 - W2) >> 1;
		W2 = W2 - W0;
		W3 = ((W2 - W3) >> 1) + (W4 << 1);
		W2 = W2 + W1;

		W3 = W4 * x + W3 * y;
		W1 = W2 * x + W1 * y;
		W1 = W1 - W3;

		W = W3 * x3 + W1 * x * y2 + W0 * y2 * y2;
	}

	return W;
}

long long Multiplication::AsymeticalSquaring(
	long long b, long long U)
{
	double k = 5.0;
	double logU = log(U) / log(b);
	double floorU = floor(floor(logU) / k);
	double i = floorU + 1.0;
	long long B = static_cast<long long>(pow(b, i));
	long long W = 0, Wp = 0;
	std::vector<long long> UDigits;

	for (long long j = 0; j < 9; j++)
	{
		long long x = 0;

		if (j == 0)
			x = 0;
		else if (j == 1)
			x = 0;
		else if (j == 2)
			x = -1;
		else if (j == 3)
			x = 1;
		else if (j == 4 || j == 5 || j == 6)
			continue;
		else if (j == 7 || j == 8)
			x = 1;

		UDigits.clear();
		BaseBDigits(B, U, UDigits);
		
		long long x2 = x * x, x3 = x * x2, x4 = x2 * x2;
		long long U0 = 0, U1 = 0, U2 = 0, U3 = 0, U4 = 0;

		if (UDigits.size() == 5)
		{
			U4 = UDigits[4];
			U3 = UDigits[3];
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (UDigits.size() == 4)
		{
			U4 = 0;
			U3 = UDigits[3];
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (UDigits.size() == 3)
		{
			U4 = 0;
			U3 = 0;
			U2 = UDigits[2];
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (UDigits.size() == 2)
		{
			U4 = 0;
			U3 = 0;
			U2 = 0;
			U1 = UDigits[1];
			U0 = UDigits[0];
		}

		if (UDigits.size() == 1)
		{
			U4 = 0;
			U3 = 0;
			U2 = 0;
			U1 = 0;
			U0 = UDigits[0];
		}

		U = U4 * x4 + U3 * x3 + U2 * x2 + U1 * x + U0;

		long long W0 = U0 - U3;
		long long W1 = U3 - U1;
		long long W6 = U1 - U2;
		long long W4 = U1 + U2;
		long long W5 = W6 - U4;
		long long W3 = W5 + (W0 << 1);
		W0 = W0 - W5;
		W6 = W0 + (W6 << 1);
		long long W7 = W6 + W1;
		W5 = W7 + W1;
		long long W8 = W5 + (W4 << 1);
		W4 = W4 - U4;

		long long W2 = (W4) * (W3);
		W4 = (W6) * (W5);
		W3 = (W7) * (W1);
		W1 = U0 * U1 * 2;
		W7 = U3 * U4 * 2;
		W5 = (W8) * (W8); W6 = (W0) * (W0);
		W0 = U0 * U0;
		W8 = U4 * U4;

		W2 = (W4) * (W3);
		W4 = (W6) * (W5);
		W3 = (W7) * (W1);
		W1 = U0 * U1 * 2;
		W7 = U3 * U4 * 2;
		W5 = (W8) * (W8);
		W6 = (W0) * (W0);
		W0 = U0 * U0;
		W8 = U4 * U4;

		W6 = (W6 + W5) >> 1;
		W5 = W5 - W6;
		W4 = (W4 + W6) >> 1;
		W3 = W3 + (W5 >> 1);
		W6 = W6 - W4;
		W5 = W5 - W3 - W1;
		W4 = W4 - W8 - W0;
		W3 = W3 - W7;
		W2 = W2 - W8 - W1 - W7 + W4 + W5;
		W6 = W6 - W2;

		W = W8 * x4 * x4 + W7 * x3 * x4 + W6 * x3 * x3 +
			W5 * x * x4 + W4 * x4 + W3 * x3 + W2 * x2 + W1 * x + W0;
		
		if (W != 0)
			Wp = W;

		if (Wp != 0)
			break;
	}

	return W;
}

/*
   Toom–Cook Multiplication (Bodrato-inspired implementation)
   Copyright (C) 2026 James Pate Williams, Jr.

   This program is free software: you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <https://www.gnu.org/licenses/>.
*/

#include "pch.h"
#include <iostream>
#include "Multiplication.h"

int main()
{
    bool balanced = false, square = true;

    if (balanced && !square)
    {
        long long U = 7;
        long long V = 8;
        long long W = 0;
        // W = U * V = 56
        W = Multiplication::BalancedToomCook3Way(10, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 9;
        V = 9;
        W = Multiplication::BalancedToomCook3Way(10, U, V);
        // W = U * V = 81
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 18;
        V = 17;
        W = Multiplication::BalancedToomCook3Way(100, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 306
        U = 123;
        V = 456;
        W = Multiplication::BalancedToomCook3Way(1000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 56,088
        U = 1234;
        V = 5678;
        W = Multiplication::BalancedToomCook3Way(10000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 7,006,652
        U = 12345;
        V = 67890;
        W = Multiplication::BalancedToomCook3Way(100000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 838,102,050
        U = 123456;
        V = 789012;
        W = Multiplication::BalancedToomCook3Way(1000000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 838,102,050
    }

    else if (!balanced && !square)
    {
        long long U = 20;
        long long V = 5;
        long long W;
        // W = U * V = 100
        W = Multiplication::UnbalancedToomCook3Way(100, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 112;
        V = 20;
        W = Multiplication::UnbalancedToomCook3Way(1000, U, V);
        // W = U * V = 2240
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 1234;
        V = 567;
        W = Multiplication::UnbalancedToomCook3Way(10000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 699,678
        U = 12345;
        V = 6789;
        W = Multiplication::UnbalancedToomCook3Way(100000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 83,810,205
        U = 123456;
        V = 78901;
        W = Multiplication::UnbalancedToomCook3Way(1000000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 97,408,265,472
        U = 1234567;
        V = 890123;
        W = Multiplication::UnbalancedToomCook3Way(10000000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 9,740,885,481,741
        U = 12345678;
        V = 9012345;
        W = Multiplication::UnbalancedToomCook3Way(100000000, U, V);
        std::cout << "U = " << U << std::endl;
        std::cout << "V = " << V << std::endl;
        std::cout << "W = " << W << std::endl;
        // W = U * V = 111,263,509,394,910
    }

    else if (square)
    {
        long long U = 8;
        long long W = Multiplication::AsymeticalSquaring(10, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 12;
        W = Multiplication::AsymeticalSquaring(100, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 321;
        W = Multiplication::AsymeticalSquaring(1000, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 1234;
        W = Multiplication::AsymeticalSquaring(10000, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 54321;
        W = Multiplication::AsymeticalSquaring(100000, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
        U = 123456;
        W = Multiplication::AsymeticalSquaring(1000000, U);
        std::cout << "U = " << U << std::endl;
        std::cout << "W = " << W << std::endl;
    }
    return 0;
}

Unknown's avatar

Author: jamespatewilliamsjr

My whole legal name is James Pate Williams, Jr. I was born in LaGrange, Georgia approximately 70 years ago. I barely graduated from LaGrange High School with low marks in June 1971. Later in June 1979, I graduated from LaGrange College with a Bachelor of Arts in Chemistry with a little over a 3 out 4 Grade Point Average (GPA). In the Spring Quarter of 1978, I taught myself how to program a Texas Instruments desktop programmable calculator and in the Summer Quarter of 1978 I taught myself Dayton BASIC (Beginner's All-purpose Symbolic Instruction Code) on LaGrange College's Data General Eclipse minicomputer. I took courses in BASIC in the Fall Quarter of 1978 and FORTRAN IV (Formula Translator IV) in the Winter Quarter of 1979. Professor Kenneth Cooper, a genius poly-scientist taught me a course in the Intel 8085 microprocessor architecture and assembly and machine language. We would hand assemble our programs and insert the resulting machine code into our crude wooden box computer which was designed and built by Professor Cooper. From 1990 to 1994 I earned a Bachelor of Science in Computer Science from LaGrange College. I had a 4 out of 4 GPA in the period 1990 to 1994. I took courses in C, COBOL, and Pascal during my BS work. After graduating from LaGrange College a second time in May 1994, I taught myself C++. In December 1995, I started using the Internet and taught myself client-server programming. I created a website in 1997 which had C and C# implementations of algorithms from the "Handbook of Applied Cryptography" by Alfred J. Menezes, et. al., and some other cryptography and number theory textbooks and treatises.

Leave a comment