Help:Dead Links

A dead link is a link to an external page that no longer leads to the content it references. Often, this is because the site no longer exists; sometimes the content itself has been removed, the URL has changed, or the content has been locked so that casual readers cannot see it. This is also known as "link rot."

According to Wikipedia:

"Like most large websites, Wikipedia suffers from the phenomenon known as link rot, where external links, often used as references and citations, gradually become irrelevant or broken, as the linked websites disappear, change their content, or move. This presents a significant threat to Wikipedia's reliability policy and its source citation guideline. The effort required to prevent link rot is significantly less than the effort required to repair or mitigate a rotten link. Therefore, prevention of link rot strengthens the encyclopedia. This guide provides strategies for preventing link rot before it happens. These include the use of web archiving services and the judicious use of citation templates."[1]

Fanlore policy on link rot is to find an updated page with the correct content if possible, but if not, to leave the original link for archive purposes and flag it with the {{Dead link}} template.

If you find a dead link, you can run the URL through one of the two archiving services described below to see whether the page was archived, and then update the dead link. In addition, Fanlore contributors have several tools they can use to prevent dead links proactively. We recommend using these tools whenever a new cite is created.

Tools

Web archiving services come and go. Use the Wikipedia page to learn about the latest services. Some editors on Fanlore use two services, listed here in order of preference.

  • The Wayback Machine (operated by the Internet Archive). This is the preferred archive tool: it is well established, run by a non-profit, and honors robots.txt exclusions. It cannot get past adult content warnings on websites, and it sometimes has difficulty unfolding collapsed comments or replies (e.g., on Dreamwidth or Livejournal).
  • Archive.is. This is a privately funded service with a much shorter track record. It is, however, the more robust cite tool, as it can often bypass adult content warnings. It is also the best tool for citing Livejournal and Dreamwidth posts, as it automatically unfolds comments, and it seems to have fewer problems with Tumblr blogs and other platforms. However, because it is funded by a single person and has a short history, it should not be relied on alone: always make a second backup archive link when using Archive.is.

Because these services can retroactively remove archived pages, consider using both services for key or important links.

Some editors also used WebCite in the past, but that service no longer accepts new submissions, so it cannot be used to archive new pages.

Usage Tips

Both web archiving services are demand-driven; only the Wayback Machine (WBM) also "crawls" websites to archive copies automatically. To prevent link rot, therefore, the editor must run the website URL through the service themselves (e.g., on the WBM, look in the lower right-hand corner of the page for the "Save Now" box). The good news is that both services also offer bookmarklets or other web browser buttons in addition to their website interfaces, and this can make the process easier.
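For editors who prefer to script this step, the sketch below builds the address that asks the Wayback Machine to capture a page. It assumes the service's `https://web.archive.org/save/` address form, which is not described on this page and may change; the function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the Wayback Machine accepts capture requests at this prefix.
SAVE_PREFIX = "https://web.archive.org/save/"


def wayback_save_url(page_url: str) -> str:
    """Return the address that requests a Wayback Machine capture of page_url."""
    parts = urlsplit(page_url)
    # Only absolute http(s) addresses can be submitted for archiving.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {page_url!r}")
    return SAVE_PREFIX + page_url


# Hypothetical page; open the printed address in a browser to trigger the capture.
print(wayback_save_url("https://example.com/fic/index.html"))
```

Opening the resulting address only requests a capture; the archived copy should still be checked before it is added to Fanlore.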

Both tools allow you to search for an existing archived copy of a website. Both the WBM and Archive.is buttons will first attempt to locate an existing copy; you can then decide whether you need an updated version if the page has changed. The Firefox button provided for Archive.is is notably more powerful than the other Archive.is plugins. It allows for saving and searching across many different archives, though Archive.is is the default.
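Looking for an existing copy can also be done from a script. The following is a minimal sketch, assuming the Wayback Machine's public availability endpoint at `https://archive.org/wayback/available` and its JSON reply with an `archived_snapshots.closest` entry; the function names are hypothetical, and Archive.is is not covered.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the Wayback Machine availability endpoint lives at this address.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str) -> str:
    """Build the lookup address for page_url."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the archived copy's address from a JSON reply, or None if absent."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_archived_copy(page_url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the service (network call) whether page_url has an archived copy."""
    with urlopen(availability_query(page_url), timeout=timeout) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

A result of `None` means no copy was reported, in which case the page should be saved by hand as described above.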

Check Your Archived Links

Always check your archived links before adding them to Fanlore to make certain you have successfully captured the page. For example, if you are citing the page for an image file, make certain the image appears on the archived version. If you are citing text, make certain the page has not been excluded from the WBM or only partially captured.

Citing FanArt

Consider adding low-resolution examples of fan art instead of only linking. If you don't want to include the fan art on Fanlore, then you should create an archive link. Since most fan art images are stored separately from the website, backing up the fan art may require two separate archive links, depending on the archive service you are using. For example, Archive.is creates a single snapshot of the entire page, including all images, so you only need to create one archive link. The WBM, however, may be more hit and miss, especially on sites like DeviantArt, Instagram, Imgur, ImageShack, Flickr or Tumblr that store their image files in a separate location. In such cases, you may need to use a two-step process to archive a picture: 1) create the archive link to the main page (Example), and then 2) create an archive link to the image (Example). Once you know that the image file has been properly archived, all you need to do is add the main archive link to Fanlore; it will then include all the necessary files.
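To see which images on a page are likely to need the second archive link, one option is to list the images served from a different host than the page itself. This is only a sketch using Python's standard library: the function name, the example hosts, and the "different host" test are assumptions, and it only looks at plain `img` tags in HTML you have already downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def offsite_images(page_url: str, html: str) -> list:
    """Return absolute image URLs hosted somewhere other than the page's host."""
    collector = _ImageCollector()
    collector.feed(html)
    page_host = urlsplit(page_url).hostname
    found = []
    for src in collector.sources:
        absolute = urljoin(page_url, src)  # resolve relative paths
        if urlsplit(absolute).hostname != page_host:
            found.append(absolute)
    return found
```

Each address returned is a candidate for step 2 of the process above; images on the page's own host are more likely to be captured along with the page.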

Handling Problems

  • Excluded from the WBM? The Wayback Machine will not archive websites that have been excluded from "crawling" or indexing. However, many older fannish websites have been re-registered by other owners or bulk resellers who have automatically removed the websites from the WBM, even if that was never the intent of the original owners.[2] If a website has been excluded, consider using Archive.is to create a stable archive link. Alternatively, consider quoting the material on the Fanlore page in more detail or creating a screencap of the page and uploading it to Fanlore. Keep in mind Fanlore's Fair Use policy when doing so.
  • Adult Content Warning? Both archiving services have problems with Adult Content Warnings (websites that pop up an "adult content" notice). Archive.is does have the ability to bypass some popup warnings, but you should check the archived page to verify success before adding it to Fanlore. For other ways of dealing with pages behind Adult Content Warnings, consider the suggestions outlined above for pages excluded from the WBM.
  • Comments not unthreading? The WBM is hit and miss when citing comments in a threaded post on platforms such as Livejournal or Dreamwidth. Archive.is will unfold threaded posts on some platforms. Consider using the "permalink" option to link to just the one comment. However, since threaded comments often contain only part of a dialogue, also look at the suggestions outlined above for pages excluded from the WBM.
  • Note about images/pictures on a website: Archive.is is best at capturing all images and text on a cited page. The Wayback Machine sometimes fails to capture images; see the Citing Fanart section for how to fix this.
  • PDFs or other files? The WBM can archive files such as PDFs, Word .doc files, or linked audio or video files. Simply plug the file URL into the WBM "archive" box, and open or save the file to your desktop. This runs the file through the archiving service and saves it. Files that are merely linked to on a website will not be automatically archived. Archive.is cannot archive PDFs, Word .doc files, or audio or video files. As of 2015, none of the archiving services can archive streaming web or audio footage of the type found on YouTube.
  • What to do about websites that are already dead? If a website has been cited on Fanlore and is offline and no longer accessible, check both archiving services to see if the page has been archived. Then replace the dead link with the archived link and add "(offline, archived)" at the end of the cite. If the page is offline or locked and has not been archived, leave the cite intact and add "(offline, not archived)" after the cite. Do not delete dead links that are not archived; just document their current status. Consider looking for alternative sources that are current and/or can be found in archives.
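The rule for links that are already dead can be written down as a small decision function. This is a sketch only: the function name is hypothetical, and finding the archived copy (if any) is left to the editor or to another tool.

```python
from typing import Optional, Tuple


def update_dead_cite(original_url: str, archived_url: Optional[str]) -> Tuple[str, str]:
    """Return (link to use, status note) for a cite whose page is offline.

    archived_url is the archived copy found in either service, or None.
    """
    if archived_url:
        # An archived copy exists: it replaces the dead link.
        return archived_url, "(offline, archived)"
    # No archived copy: keep the original link and document its status.
    return original_url, "(offline, not archived)"
```

In both cases the cite stays on the page; only the link and the note at the end change.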

Archive Citation Template

Archive links should be added after the main citation, in parentheses. For example, Uispeccoll: Here’s a media Fanzine Friday by request (accessed December 26, 2015) can be listed as: Uispeccoll: Here’s a media Fanzine Friday by request (accessed December 26, 2015) (archive link).
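As an illustration of this layout, the sketch below assembles a cite with the archive link in parentheses at the end. It assumes MediaWiki's `[URL label]` external-link syntax and English month names; the function name and all URLs are hypothetical, and this is not the Fanlore citation tool.

```python
from datetime import date
from typing import Optional


def format_cite(title: str, url: str, accessed: date,
                archive_url: Optional[str] = None) -> str:
    """Build a cite with the access date and, if given, a trailing archive link."""
    # Month name comes from strftime-style formatting (English in the default locale).
    cite = f"[{url} {title}] (accessed {accessed:%B} {accessed.day}, {accessed.year})"
    if archive_url:
        cite += f" ([{archive_url} archive link])"
    return cite


# Hypothetical addresses standing in for the real post and its archived copy.
print(format_cite(
    "Uispeccoll: Here's a media Fanzine Friday by request",
    "https://example.org/post",
    date(2015, 12, 26),
    "https://web.archive.org/web/2015/https://example.org/post",
))
```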

Fanlore has an automated citation bookmarklet that will run any page through the WBM and, if the page has been archived, create a self-contained citation template to plug into Fanlore. Besides automating both the citation and archiving of links, the citation tool will allow for easier bulk editing down the road. Info about the JavaScript citation tool has been posted to the Fanlore Dreamwidth Community (Archived version).

References