AI-powered search startup Perplexity, which is widely available to businesses and individuals in Australia and was recently valued at A$4.49 billion, has been accused of scraping content from websites without full consent.
Amazon Web Services has begun an investigation to determine whether Perplexity AI is breaking its rules, according to Wired.
AWS is looking into allegations that Perplexity is using a crawler, hosted on Amazon’s servers, that ignores the Robots Exclusion Protocol.
The protocol is a web standard under which site operators place a robots.txt file on a domain, with instructions on which pages bots may or may not access.
Although complying with those instructions is voluntary, crawlers generally respect them.
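For readers unfamiliar with how a compliant crawler reads these instructions, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the bot names `ExampleBot` and `OtherBot`, and the example.com URLs are all hypothetical and chosen only for illustration. They are not taken from Wired's site or from Perplexity's crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
# It bans one named bot from the whole site and keeps every other bot
# out of the /private/ directory.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before fetching each URL.
blocked = parser.can_fetch("ExampleBot", "https://example.com/news/story")
allowed = parser.can_fetch("OtherBot", "https://example.com/news/story")
private = parser.can_fetch("OtherBot", "https://example.com/private/page")

print(blocked, allowed, private)
```

Nothing in this check is enforced by the website itself. The crawler has to run it and choose to honour the answer, which is why a bot that skips the step can still reach pages the site has asked it to avoid.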
Wired had previously reported that it discovered a virtual machine bypassing its website’s robots.txt instructions. The machine was hosted on an AWS server, used the IP address 44.221.181.252, and was “certainly operated by Perplexity,” the publication said.
“AWS’s terms of service prohibit abusive and illegal activities and our customers are responsible for complying with those terms,” Amazon Web Services told Engadget in a statement. “We routinely receive reports of alleged abuse from a variety of sources and engage our customers to understand those reports.”
Perplexity spokesperson Sara Platnick told Wired that the company has already responded to Amazon’s inquiries and denied that its crawlers are bypassing the Robots Exclusion Protocol. “Our PerplexityBot — which runs on AWS — respects robots.txt, and we confirmed that Perplexity-controlled services are not crawling in any way that violates AWS Terms of Service,” she said.
The investigation is significant for Australia given that there is currently no oversight of Perplexity, which gained unicorn status earlier this year and is seen as a rival to the likes of ChatGPT and Google. Perplexity, founded in 2022, has already been backed by the likes of the Jeff Bezos family fund and Nvidia.
Perplexity has an annualised revenue rate of between A$22.47 million and A$29.96 million. The majority of that revenue reportedly comes from selling subscriptions to its Perplexity Pro service, which offers a search co-pilot and direct access to powerful third-party large language models, including OpenAI’s GPT-4 and Meta’s open-source Llama-3 LLM.