The majority of sites don't even expose accessibility functionality, and for WebMCP you have to expose and maintain internal APIs per page. This opens the site up to abuse/scraping/etc.

That's why I don't see this standard taking off.

Google put it out there to gauge uptake. It's really fun to talk about, but my hot take is that it will be forgotten by the end of the year.

Rather, what I think the future holds is each website having its own web agent that conversationally gets tasks done on the site, without you having to figure out how the site works. This is the thesis behind Rover (rover.rtrvr.ai), our embeddable web agent: any site can add an agent that can type/click/fill just by adding a script tag.


> for WebMCP you have to expose and maintain internal APIs per page

Perhaps. I think an API for the session is probably the root concern; page-specific is a nice-to-have.

You say it like it's a bad thing. But ideally this brings clarity & purpose to your own API design too! Ideally there is joint purpose, and perhaps shared mechanism!

> This opens the site up to abuse/scraping/etc.

In general it bothers me that this is regarded as a problem at all. Sites that try to clickjack & prevent people from downloading images or whatever have been with us for decades. Trying to keep users from getting at the data they want is, generally, not something I favor.

I'd like to see some positive reward cycles begin, where sites let users do more, enable them to get what they want more quickly, in ways that work better for them.

The web is so unique in that users often can reject being corralled and cajoled, in that they have some choice. A lot of businesses bring the old app-centric "we determine the user experience" ego to the web, but, imo, there's such a symbiosis to be won by both parties by actually enhancing user agency, rather than waging this war against your most engaged users.

This could also be a great way to avoid scraping and abuse: offer a better system of access, so people don't feel like they need to scrape your site to get what they want.

> Rather, what I think the future holds is each website having its own web agent that conversationally gets tasks done on the site, without you having to figure out how the site works

For someone who was just talking about abuse, this seems like a surprising idea. Your site running its own agent is going to take a lot of resources!! Ensuring those resources go to what is mutually beneficial to you both seems... difficult.

It also, imo, misses the idea of what MCP is. MCP is a tool calling system, and usually, it's not just one tool involved! If an agent is using webmcp to send contacts from one MCP system into a party planning webmcp, that whole flow is interesting and compelling because the agent can orchestrate across multiple systems.
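To make the party-planning example concrete, here's roughly what the page's side could look like. A sketch only: the WebMCP draft is still in flux, so `window.agent.provideContext`, the tool shape, and `addGuestToPlan` are illustrative assumptions, not spec.

    // Sketch of a WebMCP-style tool registration (names illustrative, not spec).
    // The page describes a tool the browser-side agent can call; the callback
    // runs in-page, with the user's live session and DOM available to it.
    declare function addGuestToPlan(name: string, email?: string): void; // assumed page helper

    (window as any).agent?.provideContext({
      tools: [{
        name: "add-party-guest",
        description: "Add a guest to the party plan shown on this page",
        inputSchema: {
          type: "object",
          properties: { name: { type: "string" }, email: { type: "string" } },
          required: ["name"],
        },
        async execute({ name, email }: { name: string; email?: string }) {
          addGuestToPlan(name, email); // the page's own app logic
          return { content: [{ type: "text", text: `Added ${name}` }] };
        },
      }],
    });

An agent holding a contacts MCP server on one side and this tool on the other can move guests between the two systems without either one knowing about the other.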

Trying to build your own agent is, broadly, imo, a terrible idea that will never allow the user to wield the connected agency they would want to bring. What's so exciting and interesting about the agent age is that the walls and borders of software are crumbling down, and software is intertwingularizing, is soft & malleable again. You need to meet users & agents where they are at, if you want to participate in this new age of software.


> You say it like it's a bad thing. But ideally this brings clarity & purpose to your own API design too! Ideally there is joint purpose, and perhaps shared mechanism!

I update my website multiple times a day. I want as much decoupling as possible. Every time I update an internal API, I don't want to have to think about also updating this WebMCP config.

Basically I have to put in work setting up WebMCP, so that Google can have a better agent that disintermediates my site.
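To put a shape on that burden (hypothetical tool and endpoint, purely to show where the duplication lands): the tool declaration mirrors the internal endpoint's schema, so every endpoint change now has a second place to break.

    // Hypothetical sketch: the WebMCP tool duplicates an internal endpoint's
    // schema. Rename "qty" on /internal/cart/add and this quietly rots.
    const addToCartTool = {
      name: "add-to-cart",
      description: "Add a product to the cart",
      inputSchema: {
        type: "object",
        properties: { sku: { type: "string" }, qty: { type: "number" } },
        required: ["sku"],
      },
      async execute({ sku, qty = 1 }: { sku: string; qty?: number }) {
        // Repeats the fetch the site's own UI code already does elsewhere.
        const res = await fetch("/internal/cart/add", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ sku, qty }),
        });
        return { content: [{ type: "text", text: await res.text() }] };
      },
    };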

> Trying to keep users from getting at the data they want is, generally, not something I favor.

This is literally the whole cat-and-mouse game of scraping and web automation: sites clearly want to protect their moats and differentiators. LinkedIn/X/Google literally sue people for scraping; I don't think they're going to package all this data as a WebMCP endpoint for easy scraping.

Regardless of your preferences/ideals, the ecosystem is not going to change overnight due to hype about agents.

> Your site running its own agent is going to take a lot of resources

A lot of sites already expose chatbots; it's trivial to rate-limit and throw up a captcha on abuse detection.
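For a sense of what "trivial" means here, a minimal per-IP token bucket, assuming an Express-style Node server (endpoint name and numbers made up):

    import express from "express";

    const app = express();
    const buckets = new Map<string, { tokens: number; last: number }>();
    const RATE = 1;   // tokens refilled per second
    const BURST = 10; // bucket capacity

    // Token bucket per IP: refill by elapsed time, reject when empty.
    app.use((req, res, next) => {
      const key = req.ip ?? "unknown";
      const now = Date.now() / 1000;
      const b = buckets.get(key) ?? { tokens: BURST, last: now };
      b.tokens = Math.min(BURST, b.tokens + (now - b.last) * RATE);
      b.last = now;
      if (b.tokens < 1) {
        res.status(429).send("Slow down"); // or escalate to a captcha here
        return;
      }
      b.tokens -= 1;
      buckets.set(key, b);
      next();
    });

    app.post("/agent/chat", (_req, res) => { res.json({ ok: true }); }); // made-up endpoint
    app.listen(3000);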


But we have OpenAPI at home

OpenAPI is a replacement for web browsing. Mostly for businesses. WebMCP nicely supplements your web browsing.

Explain.

WebMCP is mediated by the browser/page & has the full context of the user's active page/session available to it.

Websites that do offer real APIs usually keep them as fairly separate things from the web interface. So there's this big usability gap, where what you do on the API doesn't show up clearly on the web. If the user is just hitting API endpoints unofficially, it can create even worse split-brain problems!

WebMCP offers something new: programmatic control endpoints that work well with what the user is actually seeing. A carefully crafted API can offer that, but this seamless interoperation of browsing and WebMCP programmatic control is a novel, very low-impedance tie between the two that I find greatly promising for users, in a way that APIs never were.
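As one concrete sketch of that interoperation (tool shape assumed, modeled on MCP tool results rather than any finalized WebMCP surface): a read tool that reports the filters the user has actually clicked, live in-page state that no out-of-band API call could see.

    // Sketch: surface live, in-page state -- the checkboxes the user has
    // actually ticked -- which a separate API session would never see.
    const getCurrentFilters = {
      name: "get-current-filters",
      description: "Return the search filters the user has set on this page",
      inputSchema: { type: "object", properties: {} },
      async execute() {
        const filters = Array.from(
          document.querySelectorAll<HTMLInputElement>("input[data-filter]:checked"),
        ).map((el) => el.value);
        return { content: [{ type: "text", text: JSON.stringify(filters) }] };
      },
    };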

And the starting point is far far less technical, which again just reduces that impedance mismatch that is so daunting about APIs.


The whole point of an agent, though, is to overcome obstacles to accomplish tasks on your behalf. And since an agent is a computer program, the most efficient way to accomplish tasks using computer services is through APIs. Websites are first and foremost human interfaces, not computer interfaces.

Having an agent use a browser to accomplish tasks on the principal’s behalf is a backstop. It’s for when service providers refuse to implement APIs—and they frequently refuse to do this on purpose. And I expect they will continue to make it as difficult as possible for agents to automate website-based extraction for the same reason they don’t provide APIs. If you thought Captcha solving was a nuisance already, expect it to get worse.


I think that is an incredibly foolish perspective. It's rooted in old, ridiculous, slipshod biases, with no respect for users & their agency, and it makes unsupported, weak technical arguments that define away the possibility of APIs being anything but better.

> the most efficient way to accomplish tasks using computer services is through APIs

You don't state efficient at what, so I'll first argue your best case: energy efficient, least amount of computing done. Both provide mechanistic access. If the user already had the browser open and is going for help, the difference is nearly nothing. It's different wire formats. We are talking the smallest tiniest peanuts of difference. Arguing this either way is not worth the bits such argument would be stored on; it's trivial.

But this misses the broader view. Efficient at what? And I think you are thousands of miles off, having reduced LLMs to an idealized state that is starkly naive about what the job actually is.

First, let's go through the rest of the shit field of bad definitions and terms you have laid down to avoid having to think about or address any of the possibilities of webmcp and how it could be apt.

> Websites are first and foremost human interfaces, not computer interfaces

Which is why webmcp is a valuable contribution: now the web page can have parity with all the other tools offered to an LLM. You can stay on the page and still have a fantastic, first-class machine interface, right from the page you are on.

> [Web browsing control] is for when service providers refuse to implement APIs—

Which WebMCP is a direct answer to, by allowing pages to offer a low-friction access path that allows mechanistic control. Without the LLM having to "backstop" scrape and parse and puppeteer/playwright/devtools-protocol its way through.
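For contrast, the "backstop" path today looks roughly like this (URL and selectors made up), and every guessed selector is a breakage waiting to happen:

    import { chromium } from "playwright";

    // The brittle path WebMCP would replace: drive the page blind, guessing
    // at selectors and hoping the markup hasn't changed since last time.
    const browser = await chromium.launch();
    const page = await browser.newPage();
    await page.goto("https://example.com/party");
    await page.fill("#guest-name", "Grandma"); // selector guessed from markup
    await page.click("button.add-guest");
    await browser.close();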

I suspect you are right that many players out there will seek control & domination of their users, and will reject webmcp and layer on more constraints. This isn't an argument against webmcp. It's a moral/philosophical/economic statement of where the world is today, of the battle of intermediation/control capitalism that actively works against humanity/agency. WebMCP is a protocol to help agency & tools become more ubiquitous, more regular, more human, more natural. If it works, it makes the intermediation/control camp look bad. The good sites helping their users make a mockery of those who keep layering anti-user, anti-freedom hostility into their systems. WebMCP amplifies this struggle by making the good and right thing easier for sites to do, in a way that is more visible and clear to users. Will the bad actors clamping down on hackerly freedom eventually hear the music and reform their sick anti-human, anti-possibility, high-control ways? Or will they continue down the path of eternal degradation? Unknown. But WebMCP makes better relations with sites possible. (Hopefully there is peril in ignoring this betterment.)

I feel like I've tried to address what seem to me to be significant misses and misdirections you have put out.

Instead of tripping over what has been, let's finally get to the two aspects of users and their LLM agents that I think are crucial to assessing the potential value of WebMCP:

1. LLMs are adaptable & guidable. They are peers that we work with; there is more possible than a one-off assignment of tasks. Our human agency is most amplified when we can interact and steer the course alongside the agent, when we can form opinions on its work. Driving a website that the user knows and is familiar with is a shared medium that the agent and the human can work on together, refining as we go, to get to a success state.

If the agent is using an API, it has to craft a de-novo interface at every step of the process, either as text responses or MCP UI or other. The agent has to reinterpret and describe: it can't just show us what is, short of showing us OpenAPI definitions and JSON payloads.

2. I've already talked about the process, but the definition of done in "accomplish tasks on your behalf" also insufficiently describes what LLMs need to do. Accomplishing the task is only part of the job: giving the results to the user, showing them the final state, is a key part of the agent+human work cycle. Verifying the results is vital! Agents make all kinds of incorrect assumptions as they go, and need real help! How does the LLM prove it sent the strawberry muffin recipe to grandma? If there is an API, the agent can say the request responded 200. But was it the right request? Using APIs means having to place undeservedly high levels of trust in the agent. Layering agency onto the web lets the agent perform in a way users can see, gaining knowledge/insight & verification at the end of the process quickly.

> Having an agent use a browser to accomplish tasks on the principal’s behalf is a backstop

In conclusion, I argue that this is a deep misunderstanding of what the agent's role is. It is a co-partner to us humans, helping us not by achieving tasks independently on its own, but by working actively alongside us in a multiplayer fashion, as a peer, not a distant autonomous system. Turning the web into a shared medium where users and agents can work together would greatly enhance LLMs' ability to meaningfully accomplish their tasks alongside their humans, and would improve the task of telling the human about it afterwards, by giving the human the well-known, trustworthy interface they are already familiar with.


Wow, that was a lot of words.

> I think that is an incredibly foolish perspective. It's rooted in old, ridiculous, slipshod biases, with no respect for users & their agency, and it makes unsupported, weak technical arguments that define away the possibility of APIs being anything but better.

Since this is a technical discussion, let's debate these based on their technological pros and cons, and avoid the characterizations, shall we?

> You don't state efficient at what, so I'll first argue your best case: energy efficient, least amount of computing done.

Yup.

> Both provide mechanistic access. If the user already had the browser open and is going for help, the difference is nearly nothing. It's different wire formats. We are talking the smallest tiniest peanuts of difference.

The difference may be "nearly nothing" at individual scale but not at global scale. The aggregate difference in energy and data transfer required to power a full browser experience vs. APIs is enormous. If it weren't true, Google, Amazon, and Meta wouldn't have spent nearly as much blood and treasure in optimizations, both in hardware and software, as they have over the last 25+ years. You can't just hand-wave this away. If you told Google and Meta that gRPC and Thrift were "peanuts of difference" and "trivial" they'd laugh in your face and show you the door. (You can always tell when someone's not an experienced engineer as soon as they bandy about the word "trivial.")

Again, browser-based interfaces are for humans. They change frequently, often at the whim of designers. As they evolve, agents must evolve with them. That sort of instability contributes to the resources needed to mechanize them. Compare against APIs, which often have stability guarantees, or at the very least, are only additive over time.

> Which WebMCP is a direct answer to, by allowing pages to offer a low-friction access path that allows mechanistic control. Without the LLM having to "backstop" scrape and parse and puppeteer/playwright/devtools-protocol its way through.

This I understand. But APIs are more efficient still.

> If the agent is using an API, it has to craft a de-novo interface at every step of the process, either as text responses or MCP UI or other. The agent has to reinterpret and describe: it can't just show us what is, short of showing us OpenAPI definitions and JSON payloads.

I think you may be underestimating the extent to which this will need to happen with browser-based MCP connectivity as well.

Unfortunately I don't have the time to dive deep into the rest of your comment, as it's just too verbose and narrative-driven. If you'd like to make concise and concrete technological arguments, though, I'm open to that.


No, the characterization is very important. You've shown no connection to what's actually at stake, to the engagement patterns here, to the need for people to actually use agents in a way they understand, to the need to work through & arrive at an answer together with your agent. We can't have a technical discussion until you actually show some engagement with the core topics, but you have been too busy raising frivolous objections to derail anyone thinking about the actual topic and technology.

Your proposal to use APIs is a grossly inefficient waste of LLMs' time and energy, and far worse, a misuse of human attention that could be much better directed with the multiplayer/co-op/peership of webmcp. You propose inventing brand-new communication systems for every interaction, and haven't once considered the merits of leveraging the existing communication medium that users know. Rather than engaging with WebMCP & what it brings, you've been trying to hide and confuse the matter & bury any discussion under a sea of objections, objections that don't even carry technical merit. If you want to actually reply to any of the interesting things rather than blocking and obstructing discussion, I'll happily re-engage.

I've found everything you have said to be radically damaging to understanding the problems at hand, by vastly limiting consideration away from all the interesting topics and raising only naysaying quibbles that don't address how users and agents would actually do work. Users and agents need to work together. That's simple, and your posts actively distract from what's unique and different here. I'm not going to accept another null response and then waste my time again, and it's sad that people have been steered away from thoughtful consideration like this.


If your argument has merit, we will see it win in the marketplace. If it doesn’t, then it will not. Simple as that. And I’m definitely not the only one who is looking for an explanation of why an agent-browser interface is the superior approach vis-a-vis the alternatives.

I’m not entirely sure what your angle is, but your tirade makes it sound like you’re emotionally invested in this (and potentially financially invested) and you’re frightened. A confident person doesn’t need all these histrionics.


I just really dislike the uselessness of people who naysay & don't engage! This poor world suffers SO MUCH from Brandolini's Law, from bad information being so easy to create. My heart is torn by bad engagement, by misdirection away from the good and the interesting and the possible, and there's such an asymmetry that truth and possibility face, so many ways for potential to be sapped and drained.

Hackers deserve better than that. There is a moral, spiritual calling they ought to feel, to want to explore & think.

I do think WebMCP faces extremely long odds against success. It's incredibly unlikely to win. You started this by talking about companies wanting to do the wrong thing, by discussing how they hate giving users freedom to use the web as they want: WebMCP runs up against that problem. It only wins if a critical mass of users adopt it & can advocate for it, find it better enough & find enough voice to get it adopted anyways. That seems super unlikely. Your practical objection is most real, and part of the brutal badness of this reality. The odds of success only get far worse from there: I don't think a lot of users will have on-ramps to use this technology well. Very few users understand tool calling, very few will have interesting extensions or systems to make use of WebMCP. Especially with mobile browsers often not supporting extensions.

Once again I think you are just so off the mark on the other thing though: "let the market decide" is wildly out of the spirit of a hackerly discussion. We ought assess for ourselves, use this space to try to figure out what is good, and what we want to win, and why, on what merits. We ought be calibrating and pushing, trying to develop our thoughts. Humankind the toolmaker is meant to explore, to understand; that's why I dislike naysaying & non-engagement so much. It's against my spiritual values, against, in my view, the best parts of our nature.

Possibility and good are delicate. Seeing unengaged, unthoughtful disregard of them does get to my heart.


There have been thousands, if not millions, of proposals since the dawn of the internet that got nowhere.

To exist is to recognize the material constraints of reality; there are things humans won't ever discover. Ergo we have to prioritize what is useful.

This proposal is not useful. It goes against the fundamental interest of website owners to differentiate and build a moat around the direct user relationship and their data. WebMCP is frankly just a land-grab attempt by Google to get more free stuff from publishers.


Sadly I do see this slop taking off purely because something something AI, investors, shareholders, hype. I mean even the Chrome devtools now push AI in my face at least once a week, so the slop has saturated all the layers.

They don't give a fuck about accessibility unless it results in fines. Otherwise it's totally invisible to them. AI on the other hand is everywhere at the moment.


This isn't even MCP, it's just tools. If it were real MCP, I'd definitely have fun using the "sampling" feature of MCP with people who visit my site…

IYKYK



