Wikipedia:Link rot/URL change requests/Archives/2023/January


Mark Catholic News Service links added before 30 December 2022 as dead and add archive

Moved from WP:BOTREQ
 – * Pppery * it has begun... 01:08, 3 January 2023 (UTC)

The website Catholic News Service has been dead since 30 December 2022, due to a decision of the USCCB taken months prior. See this statement. A new website, OSV News, currently has Catholic News Service's former URLs, but the previous articles do not seem to be present on OSV News (example). All the links to previously published CNS articles currently lead to 404 errors on the OSV News website.

Therefore, my bot request is as follows: all Catholic News Service links added prior to 30 December 2022 should be marked as dead and archive URLs be added.

A discussion at WikiProject Christianity seems to support my request. Veverve (talk) 00:18, 3 January 2023 (UTC)

@Veverve: ok, it is done. - GreenC 15:47, 3 January 2023 (UTC)
@GreenC: Sometimes, the URL was left as "live" (e.g. here). All those URLs should be put to "dead".
Also, why did you choose Archive.today and not Archive.org? Or did you choose both with one being preferred? Veverve (talk) 18:50, 3 January 2023 (UTC)
The "left live" appears to be a bug related to the combo of empty archive-* fields and a filled url-status field, thanks for identifying it. It's been fixed and re-run on pages, e.g. [1]. It uses archive.today when it can't find a Wayback URL. -- GreenC 01:34, 4 January 2023 (UTC)
@GreenC: for example, the very first link of Dicastery for Evangelization is archived on Archive.org, but the archive URL which was added was that of Archive.today. Veverve (talk) 03:26, 4 January 2023 (UTC)
This happened because the API result from Wayback was what I call "bogus", meaning unreliable. The Wayback API timed out and reported 0 results (it reports 0 rather than "timed out", so you can't tell which is which). The bot then used other techniques and found the archive, but it wasn't sure how reliable it was, so it defaulted to an alternative provider where one existed. -- GreenC 03:42, 4 January 2023 (UTC)
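The fallback order described in this reply can be sketched roughly as follows. This is only an illustration of the reasoning, not the bot's actual code: the function name, its parameters and the example URLs are all hypothetical, and the three inputs are assumed to have been looked up already.

```python
from typing import Optional


def choose_archive(api_snapshot: Optional[str],
                   unverified_snapshot: Optional[str],
                   alternative_snapshot: Optional[str]) -> Optional[str]:
    """Pick an archive URL for a dead link (hypothetical helper).

    api_snapshot         -- Wayback URL returned by the API, or None when the
                            API reported 0 results (which may hide a timeout)
    unverified_snapshot  -- Wayback URL found by other techniques, of
                            uncertain reliability
    alternative_snapshot -- URL at another provider such as archive.today
    """
    # A snapshot confirmed by the API is trusted and used directly.
    if api_snapshot:
        return api_snapshot
    # "0 results" is ambiguous, so a known alternative provider is preferred
    # over a Wayback URL whose reliability could not be confirmed.
    if alternative_snapshot:
        return alternative_snapshot
    # Otherwise fall back to the unverified Wayback URL, if there is one.
    return unverified_snapshot
```

Under these assumptions, a page that is in fact archived at Archive.org can still receive an Archive.today link whenever the API lookup comes back empty, which matches the behaviour reported above.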

Greenhivesaudio

It seems the site is dead, but it says 403 instead of 404. Like this. So I guess they are better replaced or tagged as dead links. Kailash29792 (talk) 11:35, 11 January 2023 (UTC)

User:Kailash29792, from what I can tell the domain only exists on 7 pages. Can you do it manually? That would be easier and probably more accurate. -- GreenC 18:25, 11 January 2023 (UTC)
All done. This could be archived. Kailash29792 (talk) 12:52, 13 January 2023 (UTC)
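For reference, the 403-versus-404 distinction raised in this thread could be handled along the lines below. This is a minimal sketch under an assumption the thread does not state: a 403 is treated as "needs a human look" rather than as dead, since some sites return 403 to automated clients while still serving readers. The function name and the labels are hypothetical.

```python
from http import HTTPStatus


def classify_status(code: int) -> str:
    """Map an HTTP status code to a coarse link state (hypothetical labels)."""
    # 404 and 410 are unambiguous: the page is gone.
    if code in (HTTPStatus.NOT_FOUND, HTTPStatus.GONE):
        return "dead"
    # 403 may mean a dead site or merely a blocked client, so it is queued
    # for manual checking instead of being tagged automatically.
    if code == HTTPStatus.FORBIDDEN:
        return "review"
    if 200 <= code < 300:
        return "live"
    return "unknown"
```

With only 7 pages affected, the "review" bucket is small enough that checking by hand, as suggested above, is the simpler route.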

http 301 domains

  • faunaeur.org -> fauna-eu.org
  • dfw.cbslocal.com -> cbsnews.com/dfw/
  • house.state.tx.us -> house.texas.gov

thank u. <_> jindam, vani (talk) 17:07, 11 January 2023 (UTC)

jindam, vani: they are done. -- GreenC 15:47, 14 January 2023 (UTC)
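A host-for-host move like the ones listed in this request can be illustrated with the sketch below. It assumes that the new host keeps the same paths and query strings as the old one, which the thread does not confirm, and it leaves out dfw.cbslocal.com because that move also changes the path. The names HOST_MAP and rewrite_host and the example paths are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Old host -> new host, taken from the list above (host-only moves).
HOST_MAP = {
    "faunaeur.org": "fauna-eu.org",
    "house.state.tx.us": "house.texas.gov",
}


def rewrite_host(url: str) -> str:
    """Swap a moved host for its new one, keeping scheme, path and query."""
    parts = urlsplit(url)
    new_host = HOST_MAP.get((parts.hostname or "").lower())
    if new_host is None:
        # Host is not in the table: leave the URL untouched.
        return url
    # Note: this replaces the whole netloc, so any port or userinfo in the
    # original URL would be dropped.
    return urlunsplit(parts._replace(netloc=new_host))
```

For example, `rewrite_host("http://house.state.tx.us/members/")` gives `http://house.texas.gov/members/`. A real run would still need to confirm that each rewritten URL resolves.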

dspace.usc.es -> minerva.usc.es, hankwilliamsdiscography.com -> jazzdiscography.com, asia.eurosport.com -> eurosport.com, cstv.com -> cbssports.com

(1) dspace.usc.es -> minerva.usc.es; (insource) 16 links
(2) hankwilliamsdiscography.com -> jazzdiscography.com; (insource) 67 links
(3) asia.eurosport.com -> eurosport.com; changes sub-domain and redirects to / for links; (insource) 231 links
(4) cstv.com -> cbssports.com; changes sub-domain and redirects to / for links; (insource) 618 links

<_>jindam, vani (talk) 16:26, 12 January 2023 (UTC)

These are done. -- GreenC 15:22, 15 January 2023 (UTC)
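Two of the domains in this request are described as redirecting deep links to "/", meaning the old article URL lands on the new site's home page. A check for that case might look like the sketch below. It assumes the final URL after redirects has already been obtained, and the function name and example paths are hypothetical. How the bot actually treated such links is not stated in the thread.

```python
from urllib.parse import urlsplit


def redirects_to_homepage(original: str, final: str) -> bool:
    """Return True when a deep link ended up at the bare home page."""
    original_path = urlsplit(original).path.rstrip("/")
    final_parts = urlsplit(final)
    final_path = final_parts.path.rstrip("/")
    # The original must point at a specific page, and the final URL must
    # have neither a path nor a query string.
    return bool(original_path) and not final_path and not final_parts.query
```

A link flagged this way no longer leads to the cited content, so simply swapping in the new domain would not help the reader. An archived copy of the original URL is usually the more useful replacement.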

replace dead BYU Library findingaid links with the ones they redirect to

Is it possible to make a bot that replaces old links with new links that the old links redirect to?

The old BYU Library finding aids (pages that explain the contents of a collection within special collections) are served from the URL https://findingaid.lib.byu.edu. Right now, they redirect to their parallel URL on our new archivesspace pages in most cases. I would like to change the URLs, because one of my colleagues informed me that the old URLs may not redirect indefinitely. I have done a few hundred manually but there are still some 600 left. Occasionally, there is an error in the redirect (for example, with item-level things from the folklore collection) and human intervention is useful. If you can limit the bot to change links in the external links section, I can manually change links that are used as references (to ensure that the same information is present). Rachel Helps (BYU) (talk) 20:46, 3 January 2023 (UTC)

User:Rachel Helps (BYU), hi I can do this but can't limit based on where the URL is on the page. Do you have an example of a redirect error I can learn from? -- GreenC 18:21, 11 January 2023 (UTC)
That would be amazing--I can cleanup links in refs. Here is an example that works: on Mary_Elizabeth_Rollins_Lightner#External_links, https://findingaid.lib.byu.edu/viewItem/Vault%20MSS%20363 redirects to http://archives.lib.byu.edu/repositories/14/resources/7316. Most of the items in the folklore collection will have errors though, because item-level cataloging for this collection was removed (they have FA in the URL). One example I've preserved in the reference authored by Kristi Young on Alice Louise Reynolds. https://findingaid.lib.byu.edu/viewItem/FA%205/4.18.9.1.1/ redirects to the error page http://archives.lib.byu.edu/repositories/14/archival_objects/76703 (for this particular source, the original was preserved on archive.org https://web.archive.org/web/20150107211032/https://findingaid.lib.byu.edu/viewItem/FA%205/4.18.9.1.1/). I've already fixed most of the FAs; the remaining ones, #s 4-10 on the special link search, are still there because they're sources, not just external links. Another error example I found was https://findingaid.lib.byu.edu/viewItem/UA%201020/Series%204/box%202/folder%204/ on LDS Hospital (probably another item-level item that was eliminated with the transition to archivesspace). Looking for more error examples on the last 500 items in the special link search, I think that most "Series"-level links will have errors. Rachel Helps (BYU) (talk) 18:47, 11 January 2023 (UTC)
User:Rachel Helps (BYU), this should be no problem: the error page returns a 404 code, which the bot detects, and it will then attempt to find an archive URL to use as a replacement. I'll run 50 articles or so and you can check to make sure it's on the right track. I need to process about.com (below) first, as this is an urgent job involving a usurped domain, then will return here in a few days. Thanks for your patience. -- GreenC 03:22, 12 January 2023 (UTC)

User:Rachel Helps (BYU), OK, got to it faster than expected. The bot made 50 edits; can you look and report any problems? It starts at Tracy Hall at the top and runs through to Charles L. Walker. If it looks OK I'll do the rest. -- GreenC 19:09, 12 January 2023 (UTC)

I went ahead and finished it. If you see any problems, let me know. -- GreenC 15:20, 15 January 2023 (UTC)
Thank you so much!! You have lifted a weight from my worklist in a very efficient way. Rachel Helps (BYU) (talk) 19:33, 16 January 2023 (UTC)
So glad to hear that, anytime you need help I am here. -- GreenC 22:37, 16 January 2023 (UTC)
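The approach discussed in this thread (follow the old finding-aid URL, keep the redirect target if it works, otherwise look for an archive) can be summarised in a short sketch. It is not the bot's code: the function name and action labels are hypothetical, and the status code and archive URL are assumed to have been looked up beforehand.

```python
from typing import Optional, Tuple


def plan_edit(target_url: Optional[str],
              target_status: int,
              archive_url: Optional[str]) -> Tuple[str, Optional[str]]:
    """Decide what to do with an old link, given where it redirects."""
    # The redirect target loads normally: point the link at it.
    if target_url and target_status == 200:
        return ("replace", target_url)
    # The target is an error page (reported above as a 404): use an
    # archived copy of the old URL when one exists.
    if archive_url:
        return ("archive", archive_url)
    # Nothing usable was found: tag the link as dead for a human to fix.
    return ("tag-dead", None)


# The two examples given in the thread:
working = plan_edit(
    "http://archives.lib.byu.edu/repositories/14/resources/7316", 200, None)
broken = plan_edit(
    "http://archives.lib.byu.edu/repositories/14/archival_objects/76703", 404,
    "https://web.archive.org/web/20150107211032/"
    "https://findingaid.lib.byu.edu/viewItem/FA%205/4.18.9.1.1/")
```

Here `working` selects the new archivesspace URL, while `broken` falls back to the archive.org copy mentioned above.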

industry.bnet.com -> cbsnews.com/moneywatch/, people.africadatabase.org -> hotels-dubai.org/africadatabaseorg/, examiner.ie -> irishexaminer.com, apnewsarchive.com -> apnews.com, xwebapp.ustrotting.com -> ustrottingnews.com, reader.digitale-sammlungen.de -> digitale-sammlungen.de, somali.asso.fr -> biotaxis.fr

(1) industry.bnet.com -> cbsnews.com/moneywatch/; (insource) 25 links
(2) people.africadatabase.org -> hotels-dubai.org/africadatabaseorg/; (insource) 57 links
(3) examiner.ie -> irishexaminer.com; (insource) 314 links
(4) apnewsarchive.com -> apnews.com; changes scheme from http to https, changes domain and changes path; (insource) 1477 links
(5) xwebapp.ustrotting.com -> ustrottingnews.com; changes domain and redirects to /; (insource) 52 links
(6) reader.digitale-sammlungen.de -> digitale-sammlungen.de; changes sub-domain and redirects to crufty url; (insource) 670 links
(7) somali.asso.fr -> biotaxis.fr; changes domain and redirects to /; (insource) 236 links

thank u. <_>jindam, vani (talk) 08:00, 13 January 2023 (UTC)

All done. -- GreenC 01:56, 18 January 2023 (UTC)

It seems the site is down since they failed to renew their license. All links must be tagged as url-status=dead and others have archives added. Kailash29792 (talk) 08:36, 20 January 2023 (UTC)

Done. It updated 777 links. -- GreenC 22:00, 27 January 2023 (UTC)
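Tagging a citation as url-status=dead and adding an archive comes down to setting three template parameters. The sketch below shows one simple way to do that on a single citation string. It is an assumption-laden illustration, not the bot's implementation: set_param and mark_dead are hypothetical helpers, the regular expression assumes a flat template with no nested templates, wikilinks or pipes inside values, and a real tool would more likely use a proper wikitext parser.

```python
import re


def set_param(template: str, name: str, value: str) -> str:
    """Set |name=value, filling an existing (possibly empty) field or
    appending a new one before the closing braces."""
    if not template.rstrip().endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    pattern = re.compile(
        r"(\|\s*" + re.escape(name) + r"\s*=)[^|}]*?(\s*)(?=[|}])")
    if pattern.search(template):
        # Keep the field name and trailing whitespace, swap only the value.
        return pattern.sub(
            lambda m: m.group(1) + value + m.group(2), template, count=1)
    body = template.rstrip()[:-2].rstrip()
    return body + " |" + name + "=" + value + "}}"


def mark_dead(template: str, archive_url: str, archive_date: str) -> str:
    """Mark a citation dead and attach an archive URL and date."""
    for name, value in (("url-status", "dead"),
                        ("archive-url", archive_url),
                        ("archive-date", archive_date)):
        template = set_param(template, name, value)
    return template
```

Because existing but empty fields are filled in place rather than skipped, a citation that already carries `|url-status=live |archive-url= |archive-date=` ends up with all three values set instead of being left as "live".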