Archive for March, 2012
Mar 07 2012
Chained Lighthouse
On a recent trip to Ireland, I played around with perspective and depth of field to capture a lighthouse in a link of chain:
I wonder if it would be better as a flatter image, with the chain in focus too…
Mar 04 2012
Cleaning up a MediaWiki spam mess
I run a wiki for CURATEcamp, using MediaWiki. I don’t run it well, so it filled up with spam. I learned how to add a little math script to each page edit, and that slowed down the spam for a while, but it’s easy to get around and the spam started flowing again. So now I have 700+ pages of spam and more coming in every day. That leaves me with 3 problems to solve:
- Stop the addition of new users without confirmation
- Stop new spam
- Clean up all the spam pages
Next, I found the page Preventing access and followed the instructions to add these lines to the LocalSettings.php file:
# Disable anonymous editing
$wgGroupPermissions['*']['edit'] = false;
That stopped the random adding of new spam.
Next, I started looking for easy cleanup tools, and didn’t really find any. I could list all of the pages on the wiki, but I’d have to visit each one and delete it, a real pain for 700+ pages. I also had about 20 pages that I wanted to keep. I found a DeleteBatch extension that would allow me to put the spam page names into a text box (or text file) and delete them all at once.
Now I needed to generate a list of spam page names, so I went to the Special Page that lists All pages, and copied and pasted those into an Excel spreadsheet. It was a bit of a pain because the list was in three columns, and split into three pages, but I just dragged and dropped the list around in Excel until I had it all as one column. Most of the spam pages are user pages, and the titles of the pages end in a number. So I set up a second column that took the last 2 characters of the page title and tried to convert them to a number (the result is an error when they aren’t digits):
=VALUE(RIGHT(A115,2))
I bet I could have made it simpler with some function that converts a cell made up of a word and a number, like “ClardyGarces959” into just “959” but I couldn’t remember how to do that. Instead, I added a third column with a conditional that repeats the page title if the second column holds a number:
=IF((ISNUMBER(B115)),A115)
Next, I sorted by this column, which grouped all of the page titles that ended in a number. I visually inspected the list, and I’m glad I did because some of my legitimate pages also ended in numbers. I deleted those from the list, then pasted the list of known spam page titles into DeleteBatch.
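For anyone who would rather script the "word plus number" step than do it in Excel, here is a minimal Python sketch. It is not what I actually ran; the function name trailing_number is made up for illustration. Note that it flags any title ending in at least one digit, which is slightly broader than the two-character check in the spreadsheet.

```python
import re

# Matches one or more digits at the very end of a title.
TRAILING_DIGITS = re.compile(r"\d+$")


def trailing_number(title):
    """Return the run of digits at the end of title, or None if there is none.

    Hypothetical helper for illustration; not part of MediaWiki or DeleteBatch.
    """
    match = TRAILING_DIGITS.search(title.strip())
    return match.group(0) if match else None


print(trailing_number("ClardyGarces959"))  # 959
print(trailing_number("Main Page"))        # None
```

The result is kept as a string so leading zeros in a title are not lost.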
This left me with a handful of spam pages that I had to pick through individually, but way fewer than before.
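The whole sort-and-inspect pass could also be sketched in Python, assuming you already have the page titles as a list and a short list of pages you want to keep. The titles below are invented examples, and split_spam_candidates is a hypothetical name. I'm also assuming DeleteBatch takes one page title per line, so check the extension's own documentation before pasting.

```python
import re


def split_spam_candidates(titles, keep):
    """Split titles into (to_delete, to_review).

    to_delete: titles ending in a digit that are not on the keep list.
    to_review: everything else not on the keep list; inspect these by hand.
    """
    keep = set(keep)
    to_delete, to_review = [], []
    for title in titles:
        title = title.strip()
        if not title or title in keep:
            continue  # skip blank lines and pages we know are legitimate
        if re.search(r"\d$", title):
            to_delete.append(title)
        else:
            to_review.append(title)
    return sorted(to_delete), sorted(to_review)


# Invented sample data, not real pages from my wiki.
titles = ["ClardyGarces959", "Main Page", "Schedule 2012", "BuyCheapPills", "JonasVex41"]
keep = ["Main Page", "Schedule 2012"]

to_delete, to_review = split_spam_candidates(titles, keep)
print("\n".join(to_delete))  # one title per line, ready to paste
```

Even with a keep list, it is still worth eyeballing the delete list before running the batch, for the same reason as in the spreadsheet version: legitimate pages can end in numbers too.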
Hope this helps someone else with the same problem!
UPDATE
Make sure to look for pages in namespaces other than Main. I found a bunch more User: pages full of spam, and used the same methods as above to quickly get rid of them.
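Instead of copying from the All pages list one namespace at a time, the titles can probably be pulled from the MediaWiki API. This is a rough Python sketch, not something I ran for this cleanup. It assumes the wiki's api.php is reachable and readable, and the base URL is a placeholder. It only builds the request URL and reads titles out of an already-decoded JSON response; it does not fetch anything or handle continuation when there are more pages than the limit.

```python
from urllib.parse import urlencode

USER_NAMESPACE = 2  # MediaWiki's built-in namespace number for User: pages


def allpages_url(api_base, namespace=0, limit=500):
    """Build a list=allpages query URL for one namespace (0 is Main)."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": limit,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def titles_from_response(data):
    """Pull page titles out of a decoded allpages JSON response."""
    return [page["title"] for page in data.get("query", {}).get("allpages", [])]


# Placeholder URL; substitute your own wiki's api.php location.
print(allpages_url("https://wiki.example.org/api.php", USER_NAMESPACE))

# Hand-written sample in the shape the API returns, for illustration only.
sample = {"query": {"allpages": [{"pageid": 1, "ns": 2, "title": "User:ClardyGarces959"}]}}
print(titles_from_response(sample))
```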