Reworkd Transforms into Web Scraping Company for AI’s Future | Turtles AI
Highlights:
- Reworkd specializes in web scraping to provide structured data using AI agents.
- The shift away from the initial AgentGPT project was driven by high operational costs.
- The startup has raised a total of $4 million in funding to expand its operations.
- Reworkd takes legal precautions to ensure the data extracted is truly public and accessible.
Reworkd Transforms from Viral Startup to Web Scraping Company for the Future of AI
Reworkd, a Canadian startup founded by Asim Shrestha, Adam Watkins, and Srijan Subedi, quickly shifted its mission from the original AgentGPT project to building advanced web scraping tools. The change stemmed from the need to manage a massive influx of users and the API costs that came with it, forcing the startup to reorganize and focus on a more specific and lucrative segment.
AgentGPT was a free tool that let users create autonomous AI agents. It attracted over 100,000 daily users within a week, pushing operational costs to $2,000 per day. That pressure led to the official formation of Reworkd and the need for rapid funding, and ultimately to the move toward a platform specialized in web scraping.
The co-founders observed that one of the primary uses of AgentGPT was creating web scrapers, a widely needed task that is often complicated and expensive to do by hand. Reworkd therefore directed its efforts towards developing AI agents capable of extracting structured data from public websites, responding to the growing demand for data to train AI models.
A practical example: imagine wanting to collect statistics on all NFL players. Each team’s site has a different structure, which normally means writing a separate scraper for each site. With Reworkd, the user provides the links and a description of the desired data, and the AI agents handle the scraping process automatically.
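To make the "links plus description" idea concrete, here is a minimal Python sketch of what such a request and its expected output shape could look like. It is not Reworkd's API: the `ScrapeJob` class, the `conforms` helper, the field names, and the example.com URLs are all hypothetical, chosen only to show how one shared schema can stand in for many site-specific scrapers.

```python
from dataclasses import dataclass, field


@dataclass
class ScrapeJob:
    """Hypothetical job description: where to look and what to extract."""
    urls: list[str]
    description: str
    fields: dict[str, type] = field(default_factory=dict)


def conforms(record: dict, job: ScrapeJob) -> bool:
    """Return True if the record has exactly the requested fields, correctly typed."""
    return record.keys() == job.fields.keys() and all(
        isinstance(record[name], expected) for name, expected in job.fields.items()
    )


# Placeholder URLs: two sites with different layouts, one shared target schema.
job = ScrapeJob(
    urls=[
        "https://example.com/team-a/roster",
        "https://example.com/team-b/players",
    ],
    description="Name, position, and jersey number for every player",
    fields={"name": str, "position": str, "number": int},
)

# Records as an extraction agent might return them (made-up values).
good = {"name": "Jane Doe", "position": "QB", "number": 12}
bad = {"name": "John Roe", "pos": "WR", "number": "88"}

print(conforms(good, job))
print(conforms(bad, job))
```

Whatever produces the records, checking them against a single declared schema is what lets data from differently structured sites be combined into one table.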
Reworkd’s success has been facilitated by a $2.75 million funding round from investors like Paul Graham, AI Grant, SV Angel, General Catalyst, and Panache Ventures. This adds to a previous $1.25 million investment, bringing the total to $4 million.
Reworkd has hired Rohan Pandey as principal researcher. Pandey resides at AGI House SF, a major AI innovation hub in the Bay Area. He described Reworkd as a "universal API layer for the Internet," allowing structured access to data from nearly any website, even without specific markup.
However, using web scrapers raises legal issues, particularly over what counts as "public" web data. Reworkd has chosen to avoid news content and to work selectively with clients to minimize legal risk. The recent case in which Bright Data prevailed against Meta over the scraping of Facebook and Instagram profiles suggests that public data is generally considered accessible, but the legal landscape is still evolving.
Reworkd’s investors are optimistic about the company’s growth potential, noting that ongoing technological advancements in AI will help reduce costs and increase the efficiency of the scraping tools developed by the startup. Reworkd aims to keep costs competitive and improve the accuracy of its agents using frameworks like Banana-lyzer, which regularly assesses the scraping process’s accuracy.
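The article does not describe how Banana-lyzer measures accuracy. As a rough illustration of what assessing a scraper's accuracy can mean, the Python sketch below compares extracted records against hand-labeled expected records, field by field. The `field_accuracy` function and the sample data are assumptions made for this example, not part of Banana-lyzer.

```python
def field_accuracy(extracted: list[dict], expected: list[dict]) -> float:
    """Fraction of expected field values that the scraper reproduced exactly.

    Records are matched by position; a missing record counts all of its
    fields as wrong.
    """
    total = 0
    correct = 0
    for i, want in enumerate(expected):
        got = extracted[i] if i < len(extracted) else {}
        for key, value in want.items():
            total += 1
            if got.get(key) == value:
                correct += 1
    return correct / total if total else 0.0


# Hand-labeled ground truth (made-up values).
expected = [
    {"name": "Jane Doe", "position": "QB", "number": 12},
    {"name": "John Roe", "position": "WR", "number": 88},
]

# Scraper output with one wrong field (the second player's number).
extracted = [
    {"name": "Jane Doe", "position": "QB", "number": 12},
    {"name": "John Roe", "position": "WR", "number": 8},
]

accuracy = field_accuracy(extracted, expected)
print(f"{accuracy:.2%}")
```

Running a check like this on a fixed set of pages after every change is one simple way to notice when extraction quality regresses.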