Wikipedia:Link rot/URL change requests/Archives/2020/March

500 obsolete USDA links

We have nearly 500 links to ndb.nal.usda.gov/ndb/ which now return this redirect notice:

"As of October 1, 2019, this website (https://ndb.nal.usda.gov/ndb/) will no longer be available and users will be automatically redirected to FoodData Central..."

Picking two at random, this URL...

http://ndb.nal.usda.gov/ndb/foods/show/2950

...brings you to this page:

https://fdc.nal.usda.gov/fdc-app.html#/?query=ndbNumber:11197

and this URL...

http://ndb.nal.usda.gov/ndb/foods/show/105?fg=&man=&lfacet=&format=&count=&max=25&offset=25&sort=&qlookup=yogurt

...brings you to this page:

https://fdc.nal.usda.gov/fdc-app.html#/?query=ndbNumber:1116

Would this be a good candidate for an automated fix, or does someone have to manually fix all 500 before the original URL disappears? --Guy Macon (talk) 05:49, 6 January 2020 (UTC)

It doesn't seem like there is immediate danger of the redirects disappearing. They assured us the redirects would be in place as of Oct. 1, which is the case. But I agree it's a good idea to move URLs while redirects still exist. I'm way behind on projects and will keep this one in the queue, but it might be a while before I can program it. Most of the time, cases like this are more complex than they appear (some URLs don't have redirects, some do but lead to 404 pages, etc.). -- GreenC 04:15, 7 January 2020 (UTC)
The number in the URL data is just a sequential identifier. For low numbers, it corresponds to the ndbNumber, but that number starts skipping values. That means we'd have to do some sort of lookup, and updating based on the redirects that are in place is probably easiest. --AntiCompositeNumber (talk) 04:18, 3 March 2020 (UTC)
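A minimal sketch of the lookup-based rewrite described above, in Python. It assumes a table mapping the sequential ID in the old URL to the ndbNumber has already been collected from the redirects; the two entries below are just the examples from this thread. The names `sequential_id`, `fdc_url`, `rewrite` and `LOOKUP` are hypothetical, and this is not the bot's actual code.

```python
import re
from urllib.parse import urlsplit

# Old-style path; the trailing number is the sequential identifier,
# which is NOT always the ndbNumber.
OLD_PATH = re.compile(r"^/ndb/foods/show/(\d+)$")


def sequential_id(old_url):
    """Return the sequential ID from an old ndb.nal.usda.gov URL, or None."""
    parts = urlsplit(old_url)
    if parts.hostname != "ndb.nal.usda.gov":
        return None
    match = OLD_PATH.match(parts.path)  # the query string is not part of .path
    return int(match.group(1)) if match else None


def fdc_url(ndb_number):
    """Build a FoodData Central URL in the form the redirects land on."""
    return f"https://fdc.nal.usda.gov/fdc-app.html#/?query=ndbNumber:{ndb_number}"


def rewrite(old_url, lookup):
    """Return the new URL if the ID is in the lookup table, else the old URL."""
    seq = sequential_id(old_url)
    if seq is None or seq not in lookup:
        return old_url  # leave anything we cannot map untouched
    return fdc_url(lookup[seq])


# Assumed lookup table: sequential ID -> ndbNumber (examples from this thread).
LOOKUP = {2950: 11197, 105: 1116}

print(rewrite("http://ndb.nal.usda.gov/ndb/foods/show/2950", LOOKUP))
```

URLs whose ID is missing from the table are returned unchanged, so links without a working redirect would still need manual review.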

NASA Image and Video Library

NASA's image library moved, see this change for an example of what can be potentially mass fixed. Likely mostly a Commons issue, might be some here.

There is a chance GRIN links can be changed to the NASA Image and Video Library as well (see this change or this change), but those can get weird since there are sometimes multiple IDs. We would have to check whether the GRIN ID number matches a non-404'ed URL on images.nasa.gov. Bonus points if the NASA-image template on the image page, e.g. {{NASA-image|id=GPN-2000-001167|alternateid=S70-36485|center=JSC}}, is used to try multiple possible IDs for working links.

The first case seems simple, the second case is more difficult but still possible I think. Likely a lot more of the second issue than the first, but not sure without having a tool to check. Kees08 (Talk) 21:54, 22 December 2019 (UTC)

@GreenC: Any thoughts on this? Kees08 (Talk) 16:32, 12 February 2020 (UTC)
  • @Kees08: Thank you for your patience as I work through these one at a time (I should be going oldest to newest but somehow went the other direction); it's a lot of work to program the bot for each job. On Enwiki the first task is small enough that it might be done manually by someone. On Commons there are 300 or so; I could probably make a quick search-transform-replace script that would get most of them. However, on Commons, when modifying an image's "Source: " URL, some people complain that the new link may or may not be the original source the Commons image came from. I can see the point, though at the same time maintaining a dead link for the source doesn't seem very useful unless an archive URL can be found. As for the GRIN idea, I'm not sure I understand: how do we determine that http://grin.hq.nasa.gov/ABSTRACTS/GPN-2000-001167.html equates to https://images.nasa.gov/details-S70-36485? -- GreenC 21:50, 13 February 2020 (UTC)
    No worries on the timing, there is no rush on this; I was just seeing if there was a technical reason you had not responded to it. No problem! For GRIN, in this specific case, the workflow would be something like:
    1. Detect dead GRIN link
    2. Find ID numbers:
      1. http://grin.hq.nasa.gov/ABSTRACTS/GPN-2000-001167.html
      2. http://dayton.hq.nasa.gov/IMAGES/LARGE/GPN-2000-001167.jpg
      3. {{NASA-image|id=GPN-2000-001167|alternateid=S70-36485|center=JSC}}
    3. Use NASA Image Library API to determine if either ID number returns a valid page
    4. Replace GRIN link with NASA Image Library link
    Does that workflow make a little more sense? I would guess there are many dead GRIN links as that used to be NASA's main library before they moved it.
    On the replace text note where people complain about removing the dead link, perhaps you could add a dead link template to the dead one, and add the live link separately? Although I personally, in these cases, would prefer to just remove the dead link. Kees08 (Talk) 23:47, 13 February 2020 (UTC)
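The workflow above can be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the `GPN-NNNN-NNNNNN` pattern is inferred from the single example ID in this thread, the function names are hypothetical, and the check against the NASA Image Library API (step 3) is passed in as a callable `id_exists` rather than implemented, since the API's response format is not described here.

```python
import re

# Assumed shape of a GRIN ID, inferred from the example GPN-2000-001167.
GRIN_ID = re.compile(r"GPN-\d{4}-\d{6}")
# Matches a simple {{NASA-image|...}} template with no nested templates.
TEMPLATE = re.compile(r"\{\{NASA-image\|([^{}]*)\}\}")


def candidate_ids(wikitext):
    """Step 2: collect possible image IDs from links and the template."""
    ids = [m.group(0) for m in GRIN_ID.finditer(wikitext)]
    for template in TEMPLATE.finditer(wikitext):
        for field in template.group(1).split("|"):
            key, sep, value = field.partition("=")
            if sep and key.strip() in ("id", "alternateid") and value.strip():
                ids.append(value.strip())
    # Remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(ids))


def replacement_link(wikitext, id_exists):
    """Steps 3-4: return a details link for the first ID that checks out.

    `id_exists` is a caller-supplied function (hypothetical) that reports
    whether the NASA Image Library knows the ID, for example by querying
    https://images-api.nasa.gov/search?q=<ID>.
    """
    for image_id in candidate_ids(wikitext):
        if id_exists(image_id):
            return "https://images.nasa.gov/details-" + image_id
    return None  # nothing matched; leave the dead link for manual review


page = """
http://grin.hq.nasa.gov/ABSTRACTS/GPN-2000-001167.html
http://dayton.hq.nasa.gov/IMAGES/LARGE/GPN-2000-001167.jpg
{{NASA-image|id=GPN-2000-001167|alternateid=S70-36485|center=JSC}}
"""
# Stand-in checker for the example: pretend only the center ID is known.
print(replacement_link(page, lambda image_id: image_id == "S70-36485"))
```

Keeping the existence check as a parameter means the ID extraction can be tested offline, and the real lookup can be swapped in once it is known which IDs the API recognizes.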
OK, I wasn't aware of an API. I tried it, but without luck. For example, a search works using the new NASA ID https://images-api.nasa.gov/search?q=S70-36485 (taken from the above details-S70-36485), but a search for a GRIN ID https://images-api.nasa.gov/search?q=2000-001167 (and variants) does not. I wonder if the API is aware of GRIN IDs? -- GreenC 03:04, 14 February 2020 (UTC)
Bummer. I sent an inquiry to NASA to see if the GRIN IDs are mapped to the NASA center specific IDs. Will let you know if I hear anything back. Kees08 (Talk) 02:01, 22 February 2020 (UTC)
@GreenC: Here is the response I received: "GRIN was created and maintained by HQ History Office but they since exported most of the imagery over to Flickr under “NASA Commons”. They don’t use the GRIN numbers at all anymore. We only have 25 images in our database that cross reference a GPN #. I know if you put the GPN number in archive.org website (NON government site) it comes up with the image. Sometimes it will have a NASA center image ID too. https://archive.org/details/GPN-2000-001167"
So it sounds like we are out of luck when it comes to mapping the ID numbers. Do you have any idea how many links could be fixed with case 3 above, where we have a non-GRIN ID (such as S70-36485)? That would still be majorly helpful. Kees08 (Talk) 20:12, 25 February 2020 (UTC)
I'm not sure, sorry. This is a complex task and I'm not sure when or if I will be able to do it. I don't think the numbers are very large; you could refactor the request given what we learned and try Village Pump Technical, where there are some good programmers who might take it up. -- GreenC 16:19, 1 March 2020 (UTC)
No worries, I will try that, thanks. Kees08 (Talk) 17:33, 13 March 2020 (UTC)

Spaceflight101 may die

Sorry to copy the title of the previous section :). At an unknown date Spaceflight101.com will go away. The notice on the homepage says it is paid through to the end of 2019, though the Twitter account says "On Hiatus. Revival planned ... late 2020." It may be prudent to archive all of the Spaceflight101.com links just in case the revival does not happen. If you think it is a waste of time we can just see what happens, not a big deal. Kees08 (Talk) 17:40, 13 March 2020 (UTC)

IABot and/or nomore404 should have automatically archived every link by now; a spot check of a few confirms this. There are a couple hundred in total. -- GreenC 17:52, 13 March 2020 (UTC)

ECOSecretariat.org usurped

ECOSecretariat.org has been usurped by an unrelated website. There were 24 cites/ELs that I recovered and marked, and one that I was unable to recover and marked dead. —[AlanM1 (talk)]— 21:24, 18 March 2020 (UTC)

I blacklisted it in the IABot interface. -- GreenC 20:51, 19 March 2020 (UTC)