
IBM Patented Euler's 200-Year-Old Math Technique

LeetArxiv is a successor to Papers With Code, following the latter's shutdown.
Quick Summary
IBM owns the patent to the use of derivatives to find the convergents of a generalized continued fraction.
Here’s the bizarre thing: all they did was implement a number theory technique by Gauss, Euler and Ramanujan in PyTorch and call backward() on the computation graph.
Now IBM’s patent trolls can charge rent on a math technique that’s existed for over 200 years.

Hey, it’s Murage. I code, analyze papers, and prep marketing material solo at Leetarxiv. Fighting patent trolls was not on my 2025 bingo card. Please consider supporting me directly.

As always, code is available on Google Colab and GitHub.

1.0 Paper Introduction

The 2021 paper CoFrNets: Interpretable Neural Architecture Inspired by Continued Fractions (Puri et al., 2021)1 investigates the use of continued fractions in neural network design.

The paper takes 13 pages to assert that continued fractions (just like MLPs) are universal approximators.

The authors reinvent the wheel countless times:

  1. They rebrand continued fractions to ‘ladders’.

  2. They label basic division ‘The 1/z nonlinearity’.

  3. Ultimately, they take the well-defined concept of Generalized Continued Fractions and call them CoFrNets.

Authors rename generalized continued fractions. Taken from page 2 of (Puri et al., 2021)

Honestly, the paper is full of pretentious nonsense like this:

The authors crack jokes while collecting rent on 200 years of math knowledge. Taken from page 2

1.1 Quick Intro to Continued Fraction Expansions

Simple continued fractions are mathematical expressions of the form:

a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}}

Continued fraction. Taken from John D. Cook

where p_n / q_n is the nth convergent (Cook, 2022)2.
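
For reference, the convergents p_n / q_n satisfy the standard textbook recurrence:

p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2}, \qquad p_{-1} = 1, \; q_{-1} = 0, \; p_0 = a_0, \; q_0 = 1.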

Continued fractions have been used by mathematicians to:

  1. Approximate Pi (MJD, 2014)3 (see the short sketch after this list).

    Approximations of Pi taken from WolframAlpha
  2. Design gear systems (Brocot, 1861)4

    • Achille Brocot, a clockmaker, used continued fractions in 1861 to design gears for his watches

  3. Even Ramanujan’s math tricks utilised continued fractions (Barrow, 2000)5
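
As a quick illustration of the first item, here is a minimal Python sketch (my own, not taken from any of the cited references) that recovers the familiar approximations 22/7, 333/106 and 355/113 from the simple continued fraction of pi:

from math import pi

def convergents(x, n):
    """First n convergents p/q of the simple continued fraction of x."""
    p_prev, q_prev = 1, 0          # p_{-1}, q_{-1}
    p, q = int(x), 1               # p_0 = a_0, q_0 = 1
    yield p, q
    frac = x - int(x)
    for _ in range(n - 1):
        x = 1.0 / frac             # next partial quotient a_k = floor(1/frac)
        a = int(x)
        frac = x - a
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield p, q

for p, q in convergents(pi, 5):
    print(f"{p}/{q} = {p / q:.10f}")   # 3/1, 22/7, 333/106, 355/113, 103993/33102

The same few lines apply to any ratio; the same idea underlies picking gear-tooth counts in item 2.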

Continued fractions are well-studied, and previous LeetArxiv guides include (Lehmer, 1931)6: The Continued Fraction Factorization Method, and Stern-Brocot Fractions as a floating-point alternative.

If your background is in AI, a continued fraction looks exactly like a linear layer but the bias term is replaced with another linear layer.

(Jones, 1980)7 defines generalized continued fractions as expressions of the form:

b_0 + \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cfrac{a_3}{b_3 + \cdots}}}

written more economically as:

b_0 + \operatorname*{K}_{n=1}^{\infty} \dfrac{a_n}{b_n}

where the a_n and b_n can be integers or polynomials.

2.0 Model Architecture

The authors replace the term continued fraction with ‘ladder’ to hide the fact that they are reinventing the wheel.

The authors simply implement a continued fraction library in PyTorch and call the backward() function on the resulting computation graph.

That is, they chain linear neural network layers and use the reciprocal (not ReLU) as the primary non-linearity.

Then they replace the bias term of the current linear layer with another linear layer. This is a generalized continued fraction.
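
In symbols (my paraphrase, writing w_k for the weight vector of the k-th rung rather than using the paper's notation), a single ladder of depth d computes:

f(x) = w_0^\top x + \cfrac{1}{w_1^\top x + \cfrac{1}{w_2^\top x + \cfrac{1}{\ddots + \cfrac{1}{w_d^\top x}}}}

that is, a generalized continued fraction whose partial denominators are linear functions of the input and whose partial numerators are all 1.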

In PyTorch, their architecture resembles this:

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

class CoFrNet(nn.Module): 
    def __init__(self, input_dim, num_ladders=10, depth=6, num_classes=3, epsilon=0.1):
        super(CoFrNet, self).__init__()
        self.depth = depth
        self.epsilon = epsilon
        self.num_classes = num_classes

        #Linear layers for each step in each ladder
        self.weights = nn.ParameterList([
            nn.Parameter(torch.randn(num_ladders, input_dim)) for _ in range(depth + 1)
        ])

        #Output weights for each class
        self.output_weights = nn.Parameter(torch.randn(num_ladders, num_classes))

    def safe_reciprocal(self, x):
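        # Reciprocal 1/z nonlinearity: clamping |x| away from zero keeps the output and its gradient bounded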
        return torch.sign(x) * 1.0 / torch.clamp(torch.abs(x), min=self.epsilon)

    def forward(self, x):
        batch_size = x.shape[0]
        num_ladders = self.weights[0].shape[0]

        # Compute continued fractions for all ladders
        current = torch.einsum('nd,bd->bn', self.weights[self.depth], x)

        # Build continued fractions from bottom to top
        for k in range(self.depth - 1, -1, -1):
            a_k = torch.einsum('nd,bd->bn', self.weights[k], x)
            current = a_k + self.safe_reciprocal(current)

        # Linear combination for each class
        output = torch.einsum('bn,nc->bc', current, self.output_weights)
        return output

def test_on_waveform():
    # Load Waveform-like dataset
    X, y = make_classification(
        n_samples=5000, n_features=40, n_classes=3, n_informative=10,
        random_state=42
    )

    # Split data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Standardize
    scaler = StandardScaler()
    X_train = scaler.fit_transform(X_train)
    X_test = scaler.transform(X_test)

    # Convert to torch tensors
    X_train = torch.FloatTensor(X_train)
    X_test = torch.FloatTensor(X_test)
    y_train = torch.LongTensor(y_train)
    y_test = torch.LongTensor(y_test)

    # Model
    input_dim = 40
    num_classes = 3
    model = CoFrNet(input_dim, num_ladders=20, depth=6, num_classes=num_classes)

    # Training
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    epochs = 100
    batch_size = 64

    for epoch in range(epochs):
        model.train()
        permutation = torch.randperm(X_train.size()[0])

        for i in range(0, X_train.size()[0], batch_size):
            indices = permutation[i:i+batch_size]
            batch_x, batch_y = X_train[indices], y_train[indices]

            optimizer.zero_grad()
            outputs = model(batch_x)
            loss = criterion(outputs, batch_y)
            loss.backward()
            optimizer.step()

        # Validation
        if epoch % 10 == 0:
            model.eval()
            with torch.no_grad():
                train_outputs = model(X_train)
                train_preds = torch.argmax(train_outputs, dim=1)
                train_acc = (train_preds == y_train).float().mean()

                test_outputs = model(X_test)
                test_preds = torch.argmax(test_outputs, dim=1)
                test_acc = (test_preds == y_test).float().mean()

            print(f'Epoch {epoch:3d} | Loss: {loss.item():.4f} | Train Acc: {train_acc:.4f} | Test Acc: {test_acc:.4f}')

    print(f"\nFinal Test Accuracy: {test_acc:.4f}")
    return test_acc.item()

if __name__ == "__main__":
    accuracy = test_on_waveform()
    print(f"CoFrNet achieved {accuracy:.1%} accuracy on Waveform dataset")

3.0 Results

Testing on a non-linear waveform dataset, we observe these results:

CoFrNet learns a non-linear dataset

An accuracy of 61%.

Nowhere near SOTA, and that’s expected.

Continued fractions are well-studied, and any number theorist would tell you the gradients vanish, i.e., there are limits to the differentiability of the power series.

The authors use power series of continued fractions to interpret their moderate success. Taken from page 6 of (Puri et al., 2021)

Even Euler’s original work (Euler, 1785)8 alludes to this fact: it is an infinite series, so optimization by differentiation has its limits.

PyTorch’s autodiff engine replaces the differentiable series with a differentiable computational graph.

The authors simply implemented a continued fraction library in PyTorch and, as expected, saw that the gradients could be optimized.

4.0 The Patent

Patent application for Continued Fractions. Taken from Justia Patents

As the reviewers note, the idea seems novel, but the technique is nowhere near SOTA, and the truth is that continued fractions have existed for a while. The authors simply replace the linear layers of a neural network with generalized continued fractions.

Here’s the bizarre outcome: the authors filed for a patent on their ‘buzzword-laden’ paper in 2022.

The patent application has been published on Google Patents, and its status is marked as pending.

Here’s the thing:

  1. Continued fractions have existed longer than IBM.

  2. Differentiability of continued fractions is well-known.

  3. The authors did not do anything different from Euler’s 1785 work.

    • Generalized continued fractions can take anything as inputs: integers, polynomials, or even the CIFAR-10 dataset. That’s what the ‘generalized’ means.

Now, if IBM feels litigious, they can sue Sage, Mathematica, Wolfram, or even you for coding a 249-year-old math technique.

4.1 Who is affected by IBM’s Patent?

  1. Mechanical Engineers, Roboticists, and Industrialists

    • Continued fractions are used to find the best number of teeth for interlocking gears (Moore, 1964)9. If you happen to use derivatives to optimize your fraction selection, then you’re affected.

    Taken from page 30 of An Introduction to Continued Fractions (Moore, 1964)
  2. Pure Mathematicians and Math Educators

    I’m a Math PhD and I learnt about the patent while investigating Continued Fractions and their relation to elliptic curves (van der Poorten, 2004)10.

    I was trying to model an elliptic divisibility sequence in Python (using PyTorch) and that’s how I learnt of IBM’s patent.

    Abstract for the 2004 paper Elliptic Curves and Continued Fractions (van der Poorten, 2004)
  3. Numerical Analysts, Computational Scientists, and Sage/Maple Programmers

    Numerical analysis is the use of computer algorithms to approximate solutions to math and physics problems (Shi, 2024)11.

    Continued fractions are used in error analysis when evaluating integrals and entire books describe these algorithms (Cuyt et al., 2008)12.

Join the fight against IBM’s patent trolls

References

1. Puri, I., Dhurandhar, A., Pedapati, T., Shanmugam, K., Wei, D., & Varshney, K. R. (2021). CoFrNets: Interpretable neural architecture inspired by continued fractions. In A. Beygelzimer, Y. Dauphin, P. Liang, & J. Wortman Vaughan (Eds.), Advances in neural information processing systems. https://openreview.net/forum?id=kGXlIEQgvC

2. Cook, J. (2022). Continued fractions as matrix products. Blog post.

3. MJD. (2014). How to find continued fraction of pi. Mathematics Stack Exchange. https://math.stackexchange.com/q/716976

4. Brocot, A. (1861). Calcul des rouages par approximation, nouvelle méthode. Revue chronométrique, 3, 186–194.

5. Barrow, J. (2000). Chaos in Numberland: The secret life of continued fractions. Link.

6. Lehmer, D. H., & Powers, R. E. (1931). On factoring large numbers. Bulletin of the American Mathematical Society, 37(10), 770–776.

7. Jones, W. B., & Thron, W. J. (1980). Continued Fractions: Analytic Theory and Applications. Cambridge University Press.

8. Euler, L. (1785). De transformatione serierum in fractiones continuas, ubi simul haec theoria non mediocriter amplificatur (D. W. File, Trans., 2004). Department of Mathematics, The Ohio State University. (Original work published 1785)

9. Moore, C. (1964). An Introduction to Continued Fractions. National Council of Teachers of Mathematics. Link.

10. van der Poorten, A. J. (2004). Elliptic curves and continued fractions [Preprint]. arXiv. https://arxiv.org/abs/math/0403225

11. Shi, A. (2024). Numerical Analysis (Math 128a). UC Berkeley. Link.

12. Cuyt, A., Petersen, V. B., Verdonk, B., Waadeland, H., & Jones, W. B. (2008). Handbook of continued fractions for special functions. Springer.






Needy Programs


If you’ve been around, you might’ve noticed that our relationships with programs have changed.

Older programs were all about what you need: you can do this, that, whatever you want, just let me know. You were in control, you were giving orders, and programs obeyed.

But recently (a decade, more or less), this relationship has subtly changed. Newer programs (which are called apps now, yes, I know) started to want things from you.

Accounts

The most obvious example is user accounts. In most cases, I, as a user, don’t need an account. Yet programs keep insisting that I, not them, “need” one.

I don’t. I have more accounts already than the population of a small town. This is something you want, not me.

The only correct reaction to an account screen

And even if you give up and create one, they will never leave you alone: they’ll ask for 2FA, then for password rotation, then will log you out for no good reason. You’ll never see the end of it either way.

This got so bad that when a program doesn’t ask you to create an account, it feels refreshing.

“Okay, but accounts are still needed to sync stuff between machines.”

Wrong. Syncthing is a secure, multi-machine distributed app and yet doesn’t need an account.

“Okay, but you still need an account if you pay for a subscription?”

Mullvad VPN accepts payments and yet didn’t ask me for my email.

How come these apps can go without an account, but your code editor and your terminal can’t?

Updates

Every program has an update mechanism now. Everybody is checking for updates all the time. Some notoriously bad ones lock you out until you update. You get notified a few seconds after a new version is available.

And yet: do we, users, really need these updates? Did we ask for them?

I’ve been running barebone Nvidia drivers without their bloated desktop app (partly because it asks for an account, lol).

As a result, there’s nobody to notify me about new drivers. And you know what? It’s been fine. I could forget to update for months, and still everything works. It’s the most relaxing I’ve felt in a while.

Even terminal programs bother you with updates now.

There was a new major release of Syncthing in August. How did I learn about it? By accident; a friend told me. And you know what? I’m happy with that. If I upgrade, nothing in my life will change. It works just fine now. So do I really need an update? Is it my need?

It’s simple, really. If I need an update, I will know it: I’ll encounter a bug or a lack of functionality. Then I’ll go and update.

Until then, politely fuck off.

Notifications

Notifications are the ultimate example of neediness: a program, a mechanical, lifeless thing, an inanimate object, is bothering its master about something the master didn’t ask for. Hey, who is more important here, a human or a machine?

Notifications are like email: to-do items that are forced on you by another party. Hey, it’s not my job to dismiss your notifications!

I just downloaded this and already have three notifications to dismiss.

Sure, there are good notifications. Sometimes users need to be notified about something they care about, like the end of a long-running process.

But the general pattern is so badly abused that it’s hard to justify it now. You can make a case that giving a toddler a gun can help them protect themselves. But much worse things will probably happen much sooner.

These fucking dots.

There’s no good reason why, e.g., a code editor needs a notification system. What’s there to notify about? Updates? Sublime Text has no notifications. And you know what? It works just fine. I never felt underinformed while using it.

The ultimate example: account, update, and notification

Onboarding

The company needs to announce a new feature and makes a popup window about it.

Read this again: The company. Needs. It’s not even about the user. Never has been.

What’s new in Calendar? I don’t know, 13th month?

Did I ask about Copilot? No. The company wants me to use it. Not me:

Do I care about Figma Make? Not really, no.

Yet I still know about it, against my will.

To sum it up

I’ve read somewhere (sorry, lost the link):

ls never asks you to create an account or to update.

I agree. ls is a good program. ls is a tool. It does what I need it to do and stays quiet otherwise. I use it; it doesn’t use me. That’s a good, healthy relationship.

At the other end of the spectrum, we have services. Programs that constantly update. Programs that have news, that “keep you informed”. Programs that need something from you all the time. Programs that update Terms of Service just to remind you of themselves.

Programs that have their own agenda and that are trying to make it yours, too. Programs that want you to think about them. Programs that think they are entitled to a part of your attention. “Pick me” programs.

And you know what? Fuck these programs. Give me back my computer.


I'm sorry, but you do not have enough coins for democracy


Midy’s Theorem


The decimal expansion of 1/7 is

0.142857142857 …

Interestingly, if you split the repeating decimal period in half and add the two complements, you get a string of 9s:

142 + 857 = 999

It turns out this is true for every fraction with a prime denominator and a repeating decimal period of even length:

1/11 = 0.090909 …
0 + 9 = 9

1/13 = 0.076923 …
076 + 923 = 999

1/17 = 0.0588235294117647 …
05882352 + 94117647 = 99999999

1/19 = 0.052631578947368421 …
052631578 + 947368421 = 999999999

It was discovered by French mathematician E. Midy in 1836.
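
Here is a minimal Python sketch (mine, not from the original post) that computes the repeating period of 1/p by long division and checks that the two halves add up to a string of 9s:

def repetend(p):
    """Decimal digits of the repeating period of 1/p, for a prime p not dividing 10."""
    digits, r = [], 1
    while True:
        r *= 10
        digits.append(r // p)      # next decimal digit from long division
        r %= p
        if r == 1:                 # remainder returns to 1, so the period is complete
            return digits

for p in (7, 11, 13, 17, 19):
    d = repetend(p)
    half = len(d) // 2
    first = int("".join(map(str, d[:half])))
    second = int("".join(map(str, d[half:])))
    print(p, "".join(map(str, d)), first + second)
    # e.g. 7 142857 999, and 17 0588235294117647 99999999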


All Aboard!


Today’s guest issue is from Vedad Siljak, a type designer running Chill Type from Salzburg, Austria, as well as a fellow archive enthusiast who discovered a treasure trove of toy catalogs.


I first learned about Tyco Toys from a YouTube video on the history of Lego. Tyco Super Blocks were one of their competitors in the ’80s and the ads featured in the video boasted about their superiority. The ads didn't necessarily catch my eye, but the logotype did: An ultra bold geometric 4-letter word with 2 ligatures in it, one of them (the CO) a tiny gap shy of becoming an infinity symbol. My excitement was immeasurable.

As I was looking for the logo online, a whole new world started unraveling itself. The Internet Archive has a big collection of Tyco catalogs ranging from the '60s to the '80s. They often feature the logo very prominently on the cover as it adapts itself to the art style around it. On one cover it gets a hand drawn outline, on the next one it serves as a divider between two worlds of toys and on yet another one it becomes part of the backdrop as a physical prop for the photo shoot. (On the 1978 cover it serves as a bridge AND as a divider.)

A look inside the catalogs reveals more type love from Tyco. The Chattanooga Choo-Choo, Curve Huggers and Nite-Glow each have their own distinct logotype. There are also whole spreads where the headlines become intertwined with the toys shown on the pages. Not only does it look really cool, it also helps establish more depth in the images, as the letters align with the vanishing points of the different compositions. It's always exciting to see type treated with this much care, as it helps paint a more nuanced picture and adds to the immersion of the world being built in front of you.


Vedad’s featured archive is Fonts In Use. It's one of the best places to explore how fonts are used in the real world and has entries dating back to the 1500s. It's also one of the most promising starting points when trying to identify unknown fonts.

Sources: Author’s scans



UC San Diego Reports 'Steep Decline' in Student Academic Preparation

The University of California, San Diego has documented a steep decline in the academic preparation of its entering freshmen over the past five years, according to a report [PDF] released this month by the campus's Senate-Administration Working Group on Admissions. Between 2020 and 2025, the number of students whose math skills fall below middle-school level increased nearly thirtyfold, from roughly 30 to 921 students. These students now represent one in eight members of the entering cohort.

The Mathematics Department redesigned its remedial program this year to focus entirely on elementary and middle school content after discovering students struggled with basic fractions and could not perform arithmetic operations taught in grades one through eight. The deterioration extends beyond mathematics. Nearly one in five domestic freshmen required remedial writing instruction in 2024, returning to pre-pandemic levels after a brief decline. Faculty across disciplines report students increasingly struggle to engage with longer and complex texts.

The decline coincided with multiple disrupting factors. The COVID-19 pandemic forced remote learning starting in spring 2020. The UC system eliminated SAT and ACT requirements in 2021. High school grade inflation accelerated during this period, leaving transcripts unreliable as indicators of actual preparation. UC San Diego simultaneously doubled its enrollment from under-resourced high schools designated LCFF+, admitting more such students than any other UC campus between 2022 and 2024.

The working group concluded that admitting large numbers of underprepared students risks harming those students while straining limited instructional resources. The report recommends developing predictive models to identify at-risk applicants and calls for the UC system to reconsider standardized testing requirements.

Read more of this story at Slashdot.
