I only recently became aware of the pandemic that is linkrot after reading an article by the Columbia Journalism Review, which found that 25% of the links examined were completely inaccessible. Linkrot becomes more common as links age: 6% of links from 2018 had rotted, compared to 43% of links from 2008 and 72% of links from 1998. 53% of all articles that contained deep links had at least one rotted link.
While I assumed that this must be a relatively new phenomenon, I found articles dating as early as 1998, which found that 6% of the links on the Web were broken at that time and that linkrot in May 1998 was double that found by a similar survey in August 1997.
Why does this matter?
Well, while a couple of 404 errors might seem like a minor inconvenience, the problem compounds over time, and it isn't only casually shared social media links that suffer from it: 50% of U.S. Supreme Court opinions contain dead links, and so do 70% of the links in Harvard academic journals.
Because linkrot is not going to stop by itself, we need to find a way to address the issue; academic research, digital journalism and even law will be greatly affected in the near future if we don't find a solution.
Why do links rot?
There are many reasons why links die, which is why this is such a difficult problem to solve. People change their site names, rotting many of the old links; they let domain registrations expire or can't afford to keep their domains alive; businesses have their domains shut down after being acquired by another business; content "drifts", meaning that a link takes you to different content than was originally intended; and sometimes governments and businesses purposefully censor certain publications.
Currently the Internet Archive is curating a collection of snapshots of websites to keep them accessible even after their links die. Unfortunately, their collection is still far from comprehensive, and while snapshots do save the content for future use, they do not stop us from encountering dead links on the internet; we aren't going to cross-reference a massive database of archived dead links to check whether the one we are looking for happens to be there.
Blockchain of the internet + page ID of every historic version
Darko Savic, May 25, 2021
What if every page on the internet received its own ID as soon as it was detected by a scanner/bot, and every modification of the page was saved as a newer version?
Blockchain archive of the internet (minus AI-detected malicious/nonsense/spam pages)
Some huge institutions, on par with the Library of Alexandria, would keep the blockchain copies safe and up to date. This would be funded by nation-states and philanthropy.
If any link went missing, your web browser would poll a local server that pulls the data from the blockchain and shows you the missing version as it was saved at the time the link was created.
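Purely as an illustration, here is a minimal Python sketch of the versioning part of this idea: each URL gets a stable ID, and every captured revision is hash-chained to the previous one, so a dead link can be resolved to whatever version existed when the link was created. The names (PageVersion, PageHistory, the in-memory archive dict) are my own assumptions; real blockchain storage, consensus and the spam filtering mentioned above are left out.

```python
# Illustrative sketch only: stable page IDs plus an append-only, hash-chained
# version history, standing in for the proposed blockchain archive.
import hashlib
import time
from dataclasses import dataclass, field

def page_id(url: str) -> str:
    """Stable identifier derived from the URL the first time a crawler sees it."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()

@dataclass
class PageVersion:
    content_hash: str   # hash of this capture, chained to the previous one
    prev_hash: str      # hash of the previous version ("genesis" for the first)
    timestamp: float
    content: str

@dataclass
class PageHistory:
    versions: list = field(default_factory=list)

    def add_version(self, content: str) -> None:
        prev = self.versions[-1].content_hash if self.versions else "genesis"
        h = hashlib.sha256((prev + content).encode("utf-8")).hexdigest()
        self.versions.append(PageVersion(h, prev, time.time(), content))

    def version_at(self, when: float):
        """Return the version that was current at a given time, e.g. when a link was created."""
        candidates = [v for v in self.versions if v.timestamp <= when]
        return candidates[-1] if candidates else None

# Usage: the archive maps page IDs to their version chains.
archive: dict = {}
pid = page_id("https://example.com/article")
archive.setdefault(pid, PageHistory()).add_version("original text")
archive[pid].add_version("edited text")
print(archive[pid].version_at(time.time()).content)
```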
The linkrot (LR) number as a webpage "up-to-date" metric
jnikola, May 24, 2021
Although it is not going to stop the appearance of rotten links, we could use it as a marketing tool to give web pages another metric to be evaluated by.
What do you think of a Rotten Link Number or LinkRot Number (RL/LR) as a metric that is simply a count of non-working links on a website?
We would build a simple tool that scans a site for links, opens them, and checks the validity of each linked object. If the object is there, it scores the link as 1; if it's not, 0. It then calculates the percentage of working links out of all the links on the website and gives the website an overall LR score.
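A rough sketch of what such a scanner might look like, assuming Python with the third-party requests and beautifulsoup4 packages; the function names and the "any HTTP status under 400 counts as working" rule are my own assumptions, not part of the idea as written.

```python
# Sketch of an LR-score scanner: fetch a page, extract its outbound links,
# probe each one, and report the percentage of links that still resolve.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def link_is_alive(url: str, timeout: float = 10.0) -> bool:
    """Treat anything under HTTP 400 as a working link; errors count as rot."""
    try:
        resp = requests.head(url, allow_redirects=True, timeout=timeout)
        if resp.status_code == 405:  # some servers reject HEAD; retry with GET
            resp = requests.get(url, allow_redirects=True, timeout=timeout, stream=True)
        return resp.status_code < 400
    except requests.RequestException:
        return False

def linkrot_score(page_url: str) -> float:
    """Percentage of working links on the page (100 = no rot found)."""
    html = requests.get(page_url, timeout=10.0).text
    links = [urljoin(page_url, a["href"])
             for a in BeautifulSoup(html, "html.parser").find_all("a", href=True)
             if a["href"].startswith(("http", "/"))]
    if not links:
        return 100.0
    working = sum(1 if link_is_alive(u) else 0 for u in links)
    return 100.0 * working / len(links)

print(linkrot_score("https://example.com"))
```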
Why?
It could be a nice metric that tells you how up to date a website is. Many rotten links would suggest that the site is not regularly maintained and that you should be careful about the facts found there.
It would push people to fix their rotten links promptly in order to keep a high LR ranking, since it would become a metric considered by search algorithms.
Spook Louw, 4 years ago
I think this is a great idea and a very practical approach. That way, pages with broken/dead links would become irrelevant and die themselves. The one problem could be legitimate/important pages whose legitimate/important links have gone dead (I'm thinking more along the lines of academic research). You don't want to demerit those pages just because a couple of links on them have died, but for those we could perhaps use the screenshot technique.
jnikola, 4 years ago
Spook Louw I agree that it does not solve the problem, but it could force the maintainers of important web pages to constantly review and update them. It could also encourage startups and new sites with curated, "refurbished" or remastered old content.
There are archives available, but they don't solve the problem
Spook Louw, May 25, 2021
Upon further research, I've been directed to this list of available archives, and I found Mementos to be a good tool for searching through all of them simultaneously.
However, I think this highlights the scope of the problem more than it solves it, and as I mentioned before, archiving the content of dead links does nothing to help with coming across dead links while browsing.
Wouldn't it be possible to have a dead link take you directly to the archive? That would probably defeat the purpose of domain registrations, though. Would it be possible to find a way to remove all links pointing to a domain when it dies or moves? I'm not sure if they can be traced back that way.
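One way to prototype the "dead link takes you to the archive" part would be a small resolver that checks whether a link still responds and, if not, asks the Internet Archive's Wayback Machine availability endpoint (https://archive.org/wayback/available) for the closest snapshot. This is a hedged sketch: the resolve_link function and its fallback behaviour are my own assumptions, not an existing browser feature.

```python
# Sketch of a dead-link resolver: try the live URL first, then fall back to
# the closest Wayback Machine snapshot if the link appears to be rotten.
import requests

def resolve_link(url: str, timeout: float = 10.0) -> str:
    """Return the live URL if it still works, otherwise the closest archived copy."""
    try:
        if requests.head(url, allow_redirects=True, timeout=timeout).status_code < 400:
            return url
    except requests.RequestException:
        pass  # the link is dead or unreachable; fall back to the archive

    resp = requests.get("https://archive.org/wayback/available",
                        params={"url": url}, timeout=timeout)
    closest = resp.json().get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return url  # nothing archived either; the link stays rotten

print(resolve_link("http://example.com/some-old-page"))
```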