The Internet's New Plumbing (Part 1)

The company solving one of AI's biggest problems.

Since the inception of the internet, a great way to start a company was to find a source of free data and build something useful on top of it.

Search engines are a great example.

These engines were built by reading and indexing everyone’s web pages (data they didn’t pay for) and making them searchable. Everyone was largely fine with this: users got a better way to browse the web, websites gained valuable traffic, and search engines aggregated massive amounts of data that enabled them to become multi-trillion-dollar businesses.

The new AI companies are taking a similar approach. To train a large language model, you have to feed it enormous amounts of text, video, and images. The easiest place to find that is on the open internet. So naturally, they’ve been taking everything they can find.

This seems to be slowing down, partly because they’ve already scraped much of the open internet, and partly because people are starting to fight back.

You see it with the New York Times suing OpenAI, and with Reddit starting to charge for its API. The value chain has always been broken, and content creators are starting to wake up and realize that the data they create is valuable.

So what happens now? It seems likely that AI companies will have to start paying for the data they use (better late than never).

This creates a surprisingly hard problem. If you run a website, how do you react to this? How do you stop your data from being taken? Conversely, how do you opt in to get paid for your content?

For context, AI companies scrape data by running automated scripts called “web crawlers”.

Think of these crawlers as little robots that visit websites and capture all the data on the site. They are not always malicious. Google uses them to index and rank sites. If you’ve ever dealt with SEO (Search Engine Optimization), you have been dealing with crawlers. The whole SEO industry is built on sending the right signals to crawlers so that sites rank higher on search engines.

This is where the identification problem comes in: you can’t just trust the name a crawler gives you. Any script can call itself "Googlebot." You have to identify a crawler by its behaviour. But you can’t afford to be wrong. Block the real Googlebot by mistake, and your search ranking plummets. Let everyone in, and your data gets siphoned for free.
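To make the spoofing problem concrete, here is a minimal sketch of one check a site could run when a request claims to be Googlebot. Behaviour is one signal; for crawlers whose operators publish a verification method, the claimed name can also be tested against DNS. Google documents a reverse-then-forward DNS lookup for this purpose. The function name `is_verified_googlebot` and the injectable `reverse` and `forward` parameters are my own, and the suffix list is an assumption that should be re-checked against Google's current documentation.

```python
import socket
from typing import Callable, List

# Hostname suffixes assumed to belong to Google's crawlers; verify this
# list against Google's current documentation before relying on it.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the hostname that the IP's PTR record points to."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Return the IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    user_agent: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Check a 'Googlebot' claim with forward-confirmed reverse DNS."""
    if "googlebot" not in user_agent.lower():
        return False  # the request never claimed to be Googlebot
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(TRUSTED_SUFFIXES):
            return False  # PTR record is outside the trusted domains
        # The PTR record is set by whoever controls the IP, so confirm that
        # the hostname resolves back to the same address.
        return ip in forward(hostname)
    except OSError:
        return False  # lookup failed: treat the claim as unverified
```

A request with the user agent "Googlebot" from an address whose reverse lookup lands outside those domains fails the check, no matter what it calls itself. This only covers crawlers that publish a way to verify them; everything else still has to be judged by how it behaves.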

If you’re trying to run a business, this is the last problem you want to spend your time on. It's a massive, expensive game of cat and mouse. By the time you build a solution, your data will likely already be gone.

So naturally, people use a third-party company to solve this: Cloudflare ($NET).

Cloudflare already sees roughly 20% of the web's traffic. They sit between millions of websites and the internet, and their core business is figuring out who is a real person, who is a friendly bot, and who is malicious. The more traffic they see, the better their algorithms get at detection. The better the detection, the more websites use them. It’s a powerful network effect.

[Figure: Cloudflare’s network effect, which has made its detection algorithms stronger with time]

So for them, adding a pay-per-crawl service to your website is not a huge technical leap because the infrastructure is already largely in place.

Websites can block unknown bots, and website owners will be able to opt in to get paid each time a crawler fetches their content.
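The block-or-charge logic can be sketched in a few lines. This is an illustration of the idea, not Cloudflare's actual API: `CrawlRequest`, `Decision`, `decide`, and the price-offer field are all hypothetical names. The only borrowed piece is HTTP status 402 ("Payment Required"), which the HTTP standard already reserves for this kind of situation.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CrawlRequest:
    crawler_id: Optional[str]            # None when the bot could not be identified
    max_price: Optional[Decimal] = None  # most the crawler offers to pay, in USD


@dataclass(frozen=True)
class Decision:
    status: int                     # HTTP status code to return
    charge: Decimal = Decimal("0")  # amount billed to the crawler, in USD


def decide(request: CrawlRequest, price: Decimal, free_crawlers: FrozenSet[str]) -> Decision:
    """Block unknown bots, wave through allowlisted ones, charge the rest."""
    if request.crawler_id is None:
        return Decision(status=403)  # unknown bot: blocked outright
    if request.crawler_id in free_crawlers:
        return Decision(status=200)  # e.g. a search crawler the site wants
    if request.max_price is not None and request.max_price >= price:
        return Decision(status=200, charge=price)  # crawler pays the asking price
    return Decision(status=402)  # Payment Required: no offer, or offer too low
```

With an asking price of `Decimal("0.01")`, an identified AI crawler offering `Decimal("0.02")` gets the page and is billed one cent, while the same crawler with no offer gets a 402. The hard part is not this branch logic but filling in `crawler_id` reliably, which is the detection problem described above.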

However, filtering and charging crawlers for data is just a first-order consequence of the new AI-enabled era of the internet.

The really interesting part is what I believe happens next: AI agents.

The largest companies in the world are building AI agents. These agents have the ability to act on behalf of the user to get something done.

For those who aren’t familiar, let’s say you want to book a flight to Toronto. You decide to use an AI agent to complete the task. This AI agent will be able to open a web browser, search the web to find the cheapest flight, go to the airline’s website, input your credit card details, and book the flight, all on its own without any help.

This makes the identification problem we talked about an order of magnitude harder. The question is no longer “should I let this bot read my content?”, but “should I let this bot access a user’s account and make a transaction on behalf of the user?”

The stakes become enormously high. You have to be certain the agent is who it says it is and has the user's explicit permission for that specific action.

So how do you build a system to trust these agents? The old ways of thinking about security don’t seem to work. You can’t just give an agent a secret API key, because the key could be stolen or used improperly. You also can’t trust the request based on where it comes from (its IP address) because the request is likely going to be made from a generic server run by a big cloud provider. The traditional signals of trust are gone.

Old signals like “where are you” and “what’s your password” are no longer reliable enough, so the only solution is to use a more paranoid model. You have to assume every request is hostile until proven otherwise. You have to check everything, every time. You don’t just check the agent’s identification once; you check it every single time it tries to take an important action. You have to verify who the agent is, who the user it represents is, and whether that user has given the agent permission to complete that specific action right now.
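As a rough sketch of what "check everything, every time" can look like, the example below issues a short-lived grant that names one agent, one user, and one action, then re-verifies it on every request. Everything here is hypothetical: `issue_grant`, `authorize`, and `CHECKPOINT_KEY` are illustrative names, the agent's identity is assumed to have been authenticated separately, and a real deployment would more likely use standard signed tokens with asymmetric keys. The signing key stays with the checkpoint; the agent only ever holds the narrow, expiring grant.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical signing key held only by the checkpoint, never by the agent.
CHECKPOINT_KEY = b"demo-key-do-not-use-in-production"


def _sign(claims: dict) -> str:
    """Return an HMAC-SHA256 signature over the canonical JSON of the claims."""
    payload = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(CHECKPOINT_KEY, payload, hashlib.sha256).hexdigest()


def issue_grant(agent_id: str, user_id: str, action: str,
                ttl_seconds: int, now: Optional[float] = None) -> dict:
    """Sign a short-lived grant: one agent, one user, one action."""
    issued_at = time.time() if now is None else now
    claims = {"agent": agent_id, "user": user_id, "action": action,
              "expires": issued_at + ttl_seconds}
    return {"claims": claims, "signature": _sign(claims)}


def authorize(grant: dict, agent_id: str, action: str,
              now: Optional[float] = None) -> bool:
    """Run on every request: check signature, agent, action, and expiry."""
    current = time.time() if now is None else now
    claims = grant.get("claims", {})
    presented = str(grant.get("signature", "")).encode()
    if not hmac.compare_digest(_sign(claims).encode(), presented):
        return False  # grant was forged or altered
    if claims.get("agent") != agent_id:
        return False  # presented by a different agent than it was issued to
    if claims.get("action") != action:
        return False  # grant covers some other action
    return current < claims.get("expires", 0)  # expired grants are rejected
```

A grant issued for `"book_flight"` with a 60-second lifetime passes `authorize` for that action within the window and fails for any other action, any other agent, or any later time. That is the difference from a long-lived API key: a stolen grant is only good for one narrow thing, briefly.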

If I’ve lost you at this point, just know that this more secure and paranoid system is called a “Zero Trust” network and right now it seems like the only solution that makes sense for an internet with AI agents coming online.

Cloudflare has built this system.

If you take one thing from this article, it should be this: Cloudflare is in the best position to act as the internet’s intelligent “plumbing”. A good analogy is the global payment network that runs your debit and credit cards.

When you tap your card at a store, the terminal doesn't connect directly to your bank's central vault. It connects to a trusted, global network like Visa or Mastercard. That network acts as an intelligent checkpoint. It instantly verifies your identity (is the card valid?), your authorization (does the PIN or tap match?), and the context (do you have sufficient funds for this specific purchase?). The merchant never actually sees your bank account details; they just get an approval or denial from the trusted network.

That’s conceptually how Cloudflare’s Zero Trust network works. A company's application is like a bank vault. It’s taken completely off the public internet. Instead, it connects to Cloudflare’s network through a secure, outbound-only gate.

So for example, when an AI agent wants to complete a task, like booking a flight, it can't go directly to the airline's application server. It must connect to the central checkpoint, which is Cloudflare's edge.

There, Cloudflare enforces the Zero Trust rules, acting like the payment network. It checks the agent's metaphorical "card." It asks the AI agent: Who are you? Who are you acting for? Are you authorized for this specific action? Only after every answer is verified does it approve the action, opening a temporary, secure connection to let the agent complete its one specific task.

[Figure: Simplification of Cloudflare’s Zero Trust network]

If you believe in an agent-enabled internet, Cloudflare is best positioned to become a universal identity and transaction checkpoint for the internet.

This is a huge deal.

By almost every traditional metric, Cloudflare looks very expensive: a ~$75B market cap on ~$2B of annual revenue is roughly 37 times sales.

It looks obviously overvalued on those metrics, but that seems like the wrong way to look at it. What I’m buying with Cloudflare is not just a share of its current business, but a piece of what seems likely to become the standard grid for trusted internet traffic. It's a bet on the plumbing of the internet, and bets on the plumbing are often pretty solid in technological super-cycles.

When Google, Meta, and OpenAI are signalling that agents are the next frontier of AI, Cloudflare seems like an obvious beneficiary to me.

That's why I find the idea so compelling, and why it's the sort of investment I'd be happy to hold for a very long time. I expect it will be volatile; a position like this isn't for the weak-stomached. The company is trading well above fair value based on current earnings. But oddly enough, corrections are exactly what I want, as I’m going to be dollar-cost averaging for the foreseeable future.

Billions of dollars of value will be created and captured in the next decade by whoever builds the systems that manage this new kind of internet traffic. And right now, Cloudflare seems to be standing in exactly the right place, building exactly that.

Hope you enjoyed.

Stay tuned for a second post relating to Cloudflare, where I dive deeper into edge computing, inference, and national security.

Disclaimer: The preceding is a personal investment memo and represents the opinions of the author. I am a current shareholder of Cloudflare. I am not a financial advisor. This document is for informational and educational purposes only and should not be construed as a recommendation to buy or sell any security. All investors should conduct their own independent research and consult with a qualified financial professional before making any investment decisions.