Wikipedia talk:Salting is usually a bad idea

From Wikipedia, the free encyclopedia
WikiProject Essays (Low-impact)
This page is within the scope of WikiProject Wikipedia essays, a collaborative effort to organise and monitor the impact of Wikipedia essays. If you would like to participate, please visit the project page, where you can join the discussion. For a listing of essays see the essay directory.
This page has been rated as Low-impact on the project's impact scale.
The above rating was automatically assessed using data on pageviews, watchers, and incoming links.

The rule of thumb I use is whether it seems more important to the repeat-creators to add their content at a specific title, or just anywhere they can. If someone's trying to get, say, their company's page on Wikipedia, salting and blacklisting usually work: they're not going to play l33tspeak games with the title like "Micr0søft Inc.", there's a finite number of title variants they'll try, and Special:Linksearch is usually pretty good at finding them at the stage between salting the first one or two titles and progressing to a blacklist regex. The other end is comparable to semi-protecting WP:AUTOBIO, which is just insane: every edit reverted from a page like that is one that was immediately seen, that never showed up in mainspace, and that usually didn't turn into a draft that someone had to review and decline and eventually delete. Cryptic 22:41, 23 August 2022 (UTC)

Edit filters

I also think edit filters work best against LTAs: they can't create the page, though they may still try to. Thingofme (talk) 15:03, 28 August 2022 (UTC)

Titles

Actually, titles cannot be longer than 255 bytes, not 256 characters. Since the bytes 00 to 1F (decimal 0 to 31) and 7F (decimal 127) are not valid in titles, 223 byte values remain. So while there are still very many possibilities, their number is less than the square root of the figure given in this essay. Alfa-ketosav (talk) 17:16, 2 April 2024 (UTC)

Actually:

  • {, }, |, [, ], <, > and # can't be part of a title, meaning only 215 bytes are available.
  • A title's first byte can't be :, space or in the 80–BF range (these are continuation bytes, which only follow a lead byte; the lead byte also encodes how many bytes the sequence uses), reducing the available first bytes to 149.
  • F8 to FF are unused in UTF-8 for reasons of compatibility with UTF-16, reducing the number of available bytes to 207 (141 for the first byte).
  • C0 and C1 are unused to prevent longer-than-necessary byte sequences, reducing the number of available bytes to 205 (139 for the first byte).
  • The final byte can't be higher than BF, so the last byte can have 150 different values. The penultimate byte can't be higher than DF, so it can have 181 different values, and the 3rd-to-last byte can't start with the hex digit F, so it can have 197 values.
  • The space is treated equivalently to _, reducing these values to 204 (138 for the first, 149 for the last, 180 for the penultimate byte and 196 for the one before that).
  • Finally, upper- and lowercase letters are treated equally at the start of the title, reducing the number of possible values of the first byte to 112.
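
The byte counts in the list above can be reproduced with a short script. This is only a sketch that follows the exclusions exactly as they are listed in this thread; it is not a full UTF-8 or MediaWiki title validator, and all the set names are made up for illustration.

```python
# Count the byte values left after each exclusion listed in the thread.
CONTROL = set(range(0x00, 0x20)) | {0x7F}              # 00-1F and 7F
FORBIDDEN_ASCII = {ord(c) for c in "{}|[]<>#"}         # characters a title can't contain
THREAD_UNUSED = set(range(0xF8, 0x100)) | {0xC0, 0xC1}  # bytes the thread treats as unused
CONTINUATION = set(range(0x80, 0xC0))                  # 80-BF, never a first byte

printable = set(range(256)) - CONTROL                  # 223 values
allowed = printable - FORBIDDEN_ASCII                  # 215 values
any_position = allowed - THREAD_UNUSED                 # 205 values

# First byte: no continuation byte, no ':' and no space.
first = any_position - CONTINUATION - {ord(":"), ord(" ")}      # 139 values

# Positional limits near the end of the title, as stated in the thread.
last = {b for b in any_position if b <= 0xBF}
penultimate = {b for b in any_position if b <= 0xDF}            # 181 values
third_last = {b for b in any_position if b < 0xF0}              # 197 values

# Assumption: '_' is modelled as a duplicate of space, so it is dropped.
merged_any = any_position - {ord("_")}                          # 204 values
merged_first = first - {ord("_")}                               # 138 values

# The first letter is case-insensitive: drop the 26 lowercase ASCII letters.
case_folded_first = merged_first - set(range(ord("a"), ord("z") + 1))  # 112 values

for name, values in [
    ("printable", printable), ("allowed", allowed),
    ("any position", any_position), ("first", first),
    ("last", last), ("penultimate", penultimate),
    ("third-to-last", third_last), ("first, case-folded", case_folded_first),
]:
    print(f"{name}: {len(values)}")
```

Under these assumptions the sketch matches the figures in the list except for the last byte, where counting every remaining value up to BF gives 151 rather than 150. The difference may come from an extra exclusion that is not spelled out in the list.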

Thus, a better upper bound on the number of possible titles is almost 20 billion times lower than the one above. Alfa-ketosav (talk) 18:24, 3 April 2024 (UTC)
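
As a rough check of that factor, here is a minimal sketch. It assumes the looser bound is 223 choices for each of 255 bytes, that the tighter bound uses the per-position counts from the list above (112 for the first byte, 149, 180 and 196 for the last three, 204 elsewhere), and that only maximum-length titles are counted. The variable names are illustrative.

```python
# Compare the two upper bounds for titles of exactly 255 bytes.
MAX_BYTES = 255

# Looser bound: 223 possible values at every position.
loose = 223 ** MAX_BYTES

# Tighter bound: per-position counts taken from the thread.
FIRST, MIDDLE, THIRD_LAST, PENULTIMATE, LAST = 112, 204, 196, 180, 149
tight = FIRST * MIDDLE ** (MAX_BYTES - 4) * THIRD_LAST * PENULTIMATE * LAST

# Python integers are arbitrary precision, so the division is safe.
ratio = loose / tight
print(f"looser bound / tighter bound = {ratio:.3e}")
```

With these inputs the ratio comes out at roughly 2 × 10^10, which is in line with the "20 billion" figure.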