Chained Lighthouse

On a recent trip to Ireland, I played around with perspective and depth of field to capture a lighthouse in a link of chain:




I wonder if it would be better as a flatter image, with the chain in focus too…

Cleaning up a MediaWiki spam mess

I run a wiki for CURATEcamp, using MediaWiki.  I don’t run it well, so it filled up with spam.  I learned how to add a little math script to each page edit, which slowed the spam down for a while, but it’s easy to hack and the spam started flowing again.  Now I have 700+ pages of spam and more coming in every day, which leaves me with 3 problems to solve:

  1. Stop the addition of new users without confirmation
  2. Stop new spam
  3. Clean up all the spam pages

I found the ConfirmAccount extension and installed it.  That fixed #1.

Next, I found the page “Preventing access” and followed its instructions to add these lines to the LocalSettings.php file:

# Disable anonymous editing
$wgGroupPermissions['*']['edit'] = false;

That stopped new spam from coming in.


Next, I started looking for easy cleanup tools, and didn’t really find any.  I could list all of the pages on the wiki, but I’d have to visit each one and delete it, a real pain for 700+ pages.  I also had about 20 pages that I wanted to keep.  Then I found the DeleteBatch extension, which would let me put the spam page names into a text box (or text file) and delete them all at once.

Now I needed to generate a list of spam page names, so I went to the Special Page that lists All pages, and copied and pasted the titles into an Excel spreadsheet.  It was a bit of a pain because the list was in three columns, and split across three pages, but I just dragged and dropped the list around in Excel until I had it all in one column.  Most of the spam pages are user pages, and the titles of the pages end in a number.  So I set up a second column that chopped the last 2 characters from the page title:


then added a third column with a conditional that repeated the page title if it ended in a number.  I bet I could have made it simpler with some function that converts a cell made up of a word and a number, like “ClardyGarces959”, into just “959”, but I couldn’t remember how to do that.
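For anyone who would rather script this step than fight with spreadsheet formulas, here is a minimal Python sketch of the "word plus number" conversion.  It is not what I did in Excel; the function name trailing_number is made up for illustration, and it assumes the number is a plain run of ASCII digits at the very end of the title.

```python
import re


def trailing_number(title):
    """Return the run of digits at the end of a title, or "" if there is none."""
    match = re.search(r"[0-9]+$", title)
    return match.group(0) if match else ""


# The example title from the post.
print(trailing_number("ClardyGarces959"))  # prints 959
```

A title with no digits at the end comes back as an empty string, so the result can double as the "does it end in a number?" test.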


Next, I sorted by this column, which grouped all of the page titles that ended in a number.  I visually inspected the list, and I’m glad I did because some of my legitimate pages also ended in numbers.  I deleted those from the list, then pasted the list of known spam page titles into DeleteBatch.
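The sort-and-filter step can also be sketched in Python.  This is only an illustration under some assumptions: the titles are already in a list, the pages to keep are known by name, and DeleteBatch is given one page title per line.  The function name build_delete_list and every title below except “ClardyGarces959” are made up for the example.

```python
import re


def build_delete_list(titles, keep):
    """Return the titles that end in a digit, minus the ones to keep, sorted."""
    keep = set(keep)
    return sorted(
        title
        for title in titles
        if re.search(r"[0-9]$", title) and title not in keep
    )


# Hypothetical data: two spam-looking titles, one ordinary page,
# and one legitimate page that happens to end in a number.
titles = ["ClardyGarces959", "Main Page", "CURATEcamp 2012", "User:Foo12"]
keep = ["CURATEcamp 2012"]

# One title per line, ready to paste into the DeleteBatch text box.
batch = "\n".join(build_delete_list(titles, keep))
print(batch)
```

The keep list only protects pages you already know about, so it is still worth reading through the output before deleting anything.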

This left me with a handful of spam pages that I had to pick through individually, but way fewer than before.

Hope this helps someone else with the same problem!


Make sure to look for pages in namespaces other than Main.  I found a bunch more User: pages full of spam, and used the same methods as above to quickly get rid of them.
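If copying titles out of the All pages listing gets tedious, the MediaWiki web API can list every page in a namespace.  The sketch below is an alternative I did not use for this cleanup.  It assumes a MediaWiki version recent enough to return a "continue" block in its JSON responses, and the function names and the wiki URL in the comment are hypothetical.  Namespace 0 is Main and namespace 2 is User.

```python
import json
import urllib.parse
import urllib.request


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def list_titles(api_url, namespace=0, fetch=fetch_json):
    """Return every page title in one namespace, following API continuation."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    }
    titles = []
    while True:
        data = fetch(api_url + "?" + urllib.parse.urlencode(params))
        titles.extend(page["title"] for page in data["query"]["allpages"])
        if "continue" not in data:
            return titles
        # Pass the continuation values back to get the next batch of titles.
        params.update(data["continue"])


# Example call (hypothetical URL, needs network access):
# user_pages = list_titles("https://wiki.example.org/w/api.php", namespace=2)
```

The fetch argument is there so the function can be tried out with canned responses before pointing it at a live wiki.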

