Wikipedia:Link rot/URL change requests/Archives/2021/June

www.geek.com

I found many broken links on this domain: is it possible to fix them automatically? Jarble (talk) 21:30, 11 March 2021 (UTC)

This is the same situation as observer.com: in the IABot database the domain is set to Whitelisted, so the bot is not checking or fixing dead links. My bot can try; it's a lot easier than observer.com, as the numbers are small and it only requires checking for 404s. -- GreenC 01:51, 12 March 2021 (UTC)

Jarble, finally got everything in place to process this. The IABot database is updated; there are about 800 unique links across 70-some Wikis. They are checked and most are now set to Blacklisted. For enwiki, I issued a job to process articles with these links and it should correctly save. The other wikis will finish in time as the bot comes across them. If you see any problems, let me know. -- GreenC 19:28, 6 June 2021 (UTC)
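
The 404 check mentioned in this thread could be sketched roughly as follows. This is only an illustration and not the bot's actual code: the function names `http_status` and `find_dead_links` are made up for this example, and the sketch assumes that a plain HEAD request is enough to detect a dead page, which may not hold for every server.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a HEAD request, or 0 on a network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return 0


def find_dead_links(
    urls: Iterable[str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each URL to True when it answers with 404 (hypothetical helper)."""
    return {url: fetch(url) == 404 for url in urls}
```

The `fetch` argument is injectable so the logic can be exercised without network access. A real run would also need rate limiting and a decision about soft 404s, which this sketch ignores.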

unc.edu

Thread copied from WP:BOTREQ#Replace_dead_links

Please could someone replace ELs of the form

with

which produces

  • Rowlett, Russ. "Lighthouses of the Bahamas". The Lighthouse Directory. University of North Carolina at Chapel Hill.

Thanks — Martin (MSGJ · talk) 05:38, 19 March 2021 (UTC)

What sort of scale of edits are we talking (tens, hundreds, thousands)? Primefac (talk) 14:37, 19 March 2021 (UTC)
Special:LinkSearch says 1054 for "https://www.unc.edu/~rowlett/lighthouse" and 483 for the "http://" variant. DMacks (talk) 14:43, 19 March 2021 (UTC)
But spot-checking, it's a mix of {{cite web}}, plain links, and links with piped text, and with/without additional plain bibliographic notes. For example, 165 of the https:// form are in a "url=..." context. I think there are too many variations to do automatically. DMacks (talk) 15:06, 19 March 2021 (UTC)
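
The spot-check described above (template "url=" contexts versus square links versus bare links) could be approximated with a small counter like the one below. It is a rough sketch under stated assumptions: the regular expressions and the name `classify_links` are invented for illustration, and real wikitext has many edge cases (nested templates, comments, ref tags) that this does not handle.

```python
import re
from collections import Counter

# Matches the old lighthouse URLs in either scheme; stops at whitespace or
# wikitext delimiters. The pattern is an assumption, not the bot's own.
URL = r"https?://www\.unc\.edu/~rowlett/lighthouse[^\s|\]}<]*"

PATTERNS = [
    ("template_url", re.compile(r"\|\s*url\s*=\s*" + URL)),
    ("square_link", re.compile(r"\[" + URL + r"(?:\s[^\]]*)?\]")),
]
ANY_URL = re.compile(URL)


def classify_links(wikitext: str) -> Counter:
    """Count old lighthouse URLs by the context they appear in."""
    counts = Counter()
    remaining = wikitext
    for name, pattern in PATTERNS:
        # Remove each classified match so it is not counted again as bare.
        remaining, found = pattern.subn(" ", remaining)
        counts[name] += found
    counts["bare"] += len(ANY_URL.findall(remaining))
    return counts
```

Whatever is left over after the two structured patterns are removed is counted as a bare link, which is the category the thread considers too free-form to convert automatically.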

MSGJ, the only type that can be converted is {{cite web}}. As noted by User:DMacks, it's too messy to handle the square and bare links because of the free-form text that might surround the URL, unless there is some discernible pattern. There are 334 articles that contain a preceding "url=". A couple of questions:

  • Do you know if the content at http://www.ibiblio.org/lighthouse/* is the same as https://www.unc.edu/~rowlett/* as originally cited? That is, what are the chances there has been content drift for these pages?
  • What would you do if the old cite has an |archiveurl=: delete the archive or leave the cite alone?

-- GreenC 19:19, 19 March 2021 (UTC)

Thanks for looking into this, GreenC. I asked at Template talk:Cite rowlett and the working ibiblio.org links almost exactly correspond to the old unc.edu/~rowlett links. I'm not sure what to do with archive links. Keep them if they are working? The use of {{Cite rowlett}} would be preferable, where possible, but if not, then the bare links can just be replaced. Thanks — Martin (MSGJ · talk) 21:49, 22 March 2021 (UTC)
Are you able to help with this, GreenC? Otherwise I will move it back to BOTREQ where it was receiving attention. Thanks — Martin (MSGJ · talk) 07:11, 24 May 2021 (UTC)
MSGJ, I'll do this as the next project, once I finish the current one. Sorry for taking so long. Should be another day or two. -- GreenC 15:06, 24 May 2021 (UTC)

@MSGJ: there are three types of URLs that are causing {{Cite rowlett}} to throw an error:

  • ~rowlett/lighthouse/photos/code.htm
  • ~rowlett/lighthouse/resources/code.htm
  • ~rowlett/lighthouse/types/code.htm

Example diff. Luckily I saw this and stopped after 5 edits. There are 17 URLs in total, which I can provide. Also, the error shown by the template looks like it might itself be a bug, as it is not properly adding the page to a tracking category. I'll wait before proceeding in case there are more unknown errors that need tracking. -- GreenC 00:45, 25 May 2021 (UTC)
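
A guard for the three problem paths listed above, together with the URL move itself, might look like the following. This is a sketch only: the function names are hypothetical, and the one-to-one prefix mapping from unc.edu/~rowlett/lighthouse/ to ibiblio.org/lighthouse/ is an assumption based on the "almost exactly correspond" remark earlier in the thread.

```python
import re
from typing import Optional

OLD_PREFIX = re.compile(r"^https?://www\.unc\.edu/~rowlett/lighthouse/")
# Assumed target prefix, taken from the ibiblio.org URL quoted in the thread.
NEW_PREFIX = "http://www.ibiblio.org/lighthouse/"
# Sub-directories reported to make {{Cite rowlett}} throw an error.
TEMPLATE_UNSAFE_DIRS = ("photos/", "resources/", "types/")


def move_url(url: str) -> Optional[str]:
    """Return the URL rewritten to the new site, or None if it is not an old link."""
    new_url, replaced = OLD_PREFIX.subn(NEW_PREFIX, url)
    return new_url if replaced else None


def safe_for_cite_rowlett(url: str) -> bool:
    """True when an old URL is not under one of the three problem directories."""
    match = OLD_PREFIX.match(url)
    if match is None:
        return False
    return not url[match.end():].startswith(TEMPLATE_UNSAFE_DIRS)
```

Under this split, a URL that fails `safe_for_cite_rowlett` could still have its link moved to the new site, just without being wrapped in the template.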

Bot results

  • Moved 915 square-link URLs to new site. Example.
  • Converted 582 CS templates to {{cite rowlett}}. Example, Example.
  • Converted 861 square-links to {{cite rowlett}}. Example, Example.
  • Added 64 sites to the template. Diff.

Total of 1,443 new instances of the template, more than doubling the number in use. Around 200 www.ibiblio.org URLs remain that might be convertible manually. -- GreenC 00:43, 6 June 2021 (UTC)

Add Archive URLs for Showbuzz Daily refs

Moved from WP:BOTREQ

In WP:TV, we have heavily relied on showbuzzdaily.com cable TV ratings for years but the site is shutting down and might not be available in the near future. We need a bot to add archive urls to all the showbuzzdaily.com citations.

For more on the issue about the site shutting down, there's a discussion here: Wikipedia talk:WikiProject Television#U.S. TV ratings sources. — Starforce13 18:21, 13 June 2021 (UTC)

@Starforce13: will take a look. -- GreenC 18:55, 13 June 2021 (UTC)
Thanks! — Starforce13 19:09, 13 June 2021 (UTC)

For the record, details about the site are in its Twitter feed: "A sad final update: in addition to our ongoing technical issues, we've lost access to the ratings we'd been able to provide on the website. Therefore, we're sorry to say that the site is officially done." As such, the bot will treat it as a dead site now, even though pages are still live (if not entirely operational); the site is not being updated, and as "officially done" it could go offline, intentionally or otherwise, at any time. If they rebuild and open a new site, we can look at moving the URLs to the new site and unwinding the archive URLs, which the bot is capable of. -- GreenC 05:31, 15 June 2021 (UTC)

I see the bot is running. Thank you GreenC for helping with this.— Starforce13 19:11, 16 June 2021 (UTC)

Results

  • Added 24,445 archive URLs in 1,958 articles. Example
  • Changed 3,095 |url-status=live to |url-status=dead in 323 articles. Example
  • Added 233 {{dead link}} in 104 articles. Example
  • Various other fixes.
  • Set 3,150 URLs in the IABot database to Blacklist; added archive URLs; set domain to Blacklist. This will propagate to fixing links in 70+ Wiki sites over time.
  • {{Cite Showbuzz Daily Ratings}} exists; 5 articles use it. I recommend converting them manually to a normal cite web, adding archive URLs, and nominating the template for deletion.

@Starforce13: if you see anything else let me know, happy to help. -- GreenC 00:20, 17 June 2021 (UTC)

    • This is fantastic. Thank you so much! — Starforce13 01:40, 17 June 2021 (UTC)
You are welcome. It wasn't many pages, but dense linkage. -- GreenC 03:07, 17 June 2021 (UTC)
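
For readers curious what the first two result items amount to at the level of a single citation, here is a minimal sketch. It is not the bot's implementation: `mark_dead_with_archive` is a hypothetical helper, it assumes the caller already has the text of one complete citation template plus a known archive URL and date, and it ignores nested templates and other complications.

```python
import re


def mark_dead_with_archive(template: str, archive_url: str, archive_date: str) -> str:
    """Flip url-status to dead and append archive parameters to one citation."""
    if "showbuzzdaily.com" not in template:
        return template

    # Change an existing |url-status=live to |url-status=dead.
    template = re.sub(r"(\|\s*url-status\s*=\s*)live\b", r"\1dead", template)

    # Leave citations alone if they already carry an archive URL.
    if re.search(r"\|\s*archive-?url\s*=", template):
        return template

    addition = f" |archive-url={archive_url} |archive-date={archive_date}"
    if not re.search(r"\|\s*url-status\s*=", template):
        addition += " |url-status=dead"

    # Insert the new parameters just before the closing braces.
    body = template.rstrip()
    if body.endswith("}}"):
        return body[:-2].rstrip() + addition + "}}"
    return template
```

Running the helper a second time on its own output leaves the citation unchanged, which matters for a job that may revisit the same article.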

Removal of spam links to iamin.in

Moved from WP:BOTREQ

@GreenC: Is your bot capable of removing spam links to iamin.in website? LinkSearch Thanks. -- DaxServer (talk) 10:59, 14 June 2021 (UTC)

@DaxServer: Yes. The domain has been usurped, so archives would be added with |status=usurped, and the domain blacklisted in the IABot database. I'll get to it, thanks for the info. -- GreenC 14:03, 14 June 2021 (UTC)
Thanks! -- DaxServer (talk) 14:40, 14 June 2021 (UTC)
Everything is Blacklisted in the IABot database which will propagate to 70+ language wikis; and set to |url-status=unfit in English wiki (example). It was under 100 URLs in both cases. -- GreenC 04:09, 17 June 2021 (UTC)

Add url-access to The Hindu citations

@GreenC Would you be able to add url-access=limited to the citations from WP:THEHINDU, except those that are tagged dead? I just tested on a private window and it looks like one can read up to 10 articles per month before subscription. -- DaxServer (talk) 15:00, 24 June 2021 (UTC)

That would be easy to do, and this is technically limited access, but I am hesitant because of the volume without more consensus. The questions are: The Hindu's policy can change at any time; do we want to add this for all sites that have such a scheme; and at what level of limited access does it make sense to have it? -- GreenC 16:46, 24 June 2021 (UTC)
I think this is a broader conversation. It will come up again in the future if it becomes important. Let's leave them as they are for now. -- DaxServer (talk) 18:56, 28 June 2021 (UTC)