A number of news platforms have restricted the Wayback Machine, a crucial tool that helps preserve content from expired web domains, from archiving their stories, raising concerns about what this may mean for efforts to preserve history.
Major outlets like USA Today, Reddit, and the New York Times have explicitly prevented their content from being archived, Wired reported.
The Internet Archive has been around for 30 years and has archived over a trillion web pages.
Some outlets have defended their choice to block the Wayback Machine, the Internet Archive tool for browsing archived versions of websites, articles, and videos, citing concerns that tech companies could use the data to train artificial intelligence models.
According to an analysis by the AI detection startup Originality AI, 23 major news sites are currently blocking ia_archiverbot, the web crawler used frequently by the Internet Archive.
Publishers and AI companies are locked in an ongoing dispute over the legality of using publishers' content to train models without permission.
"The issue is that Times content on the Internet Archive is being used by AI companies in violation of copyright law to directly compete with us," said New York Times spokesperson Graham James. Wired added that the Times declined to clarify whether this was actually happening.
USA Today Co. spokesperson Lark-Marie Anton, for her part, stated that the company is not specifically blocking the Internet Archive but rather attempting to restrict all scraping bots.
If too many outlets restrict their content from being archived, critical information could vanish for future generations researching significant events or analyzing media from the early 21st century.
"There's no question that the general locking-down of more and more of the public web is impacting society's ability to understand what's going on in our world," said the Internet Archive's Mark Graham, according to Wired.
Currently, no easily available public tool rivals the Wayback Machine. If it continues to lose access to crucial news sources, efforts to preserve history could deteriorate, leaving early digital records, including vital environmental and climate data, harder to access or lost entirely.
"Y'ALL. This is not a drill. The Wayback Machine is often the only source of webpages that have been scrubbed from the Internet," a user wrote in a post to the social platform X alongside Wired's report.
"The wayback is the last line of defense against scrubbed history and they're just letting it die," one user commented.
Wired reported that individual reporters are beginning to advocate for the Wayback Machine, and that a coalition of advocacy groups collected over 100 signatures from journalists on a letter supporting it.
"With many newspapers ceasing operations, and no clear mechanism for local public libraries to preserve digital content, the responsibility for safeguarding journalism's history increasingly lies with the Internet Archive," the letter stated.