User:File Upload Bot (Kernigh)

This is a bot operated by User:Kernigh. It does not have a "bot" flag, so its edits will show up in recent changes!

This bot runs a modified version of the Commons:File upload service/Script. Kernigh added two hacks to the script: one to upload to MediaWiki sites other than Commons, and one to download images instead of uploading them. This way, it can copy images between wikis.
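As a minimal sketch of the first hack, assuming it comes down to a table from a short wiki name to that wiki's index.php URL (like the `%wiki_php` hash in the scripts below): the helper names `wiki_url` and `wiki_host` are hypothetical and do not appear in the scripts.

```perl
use strict;
use warnings;

# Short wiki names mapped to each wiki's index.php, as in the scripts below.
my %wiki_php = (
    'commons'      => 'http://commons.wikimedia.org/w/index.php',
    'wikibooks'    => 'http://en.wikibooks.org/w/index.php',
    'strategywiki' => 'http://strategywiki.net/w/index.php',
);

# Hypothetical helper: return the index.php URL for a wiki name, or undef.
sub wiki_url {
    my ($wiki) = @_;
    return $wiki_php{$wiki};
}

# Hypothetical helper: extract the hostname with the same kind of regex
# that mwdown.pl uses.
sub wiki_host {
    my ($url) = @_;
    return $url =~ m|^http://([^/]+)/| ? $1 : undef;
}

print wiki_host( wiki_url('strategywiki') ), "\n";   # prints strategywiki.net
```

Supporting another wiki is then a matter of adding one more entry to the table.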

20 August 2006... I have produced a version of the Perl scripts without my username and password, for others to view and download.

1 May 2006... This bot copied the Category:NetHack images from Wikimedia Commons, as part of moving the NetHack guide from Wikibooks. The bot waited 2 minutes between uploads, yet still dominated most of Special:Newimages and Special:Recentchanges.

MapleStory situation
See Wikibooks Import List/MapleStory, Community Issues

The scripts
WARNING: Someone might have edited these scripts. If you want to run them, download them from http://kernigh.pbwiki.com/DownloadFile#MediaWikibot instead!

mwdown.pl

#!/usr/bin/perl
# mwdown.pl - MediaWiki file DOWNLOAD script
#   by Kernigh - xkernigh AT netscape DOT net
#   version 2006-05-01 - this script is in the public domain
# Derived from:
# Upload script by Erik Möller - moeller AT scireview DOT de - public domain
# Developed for the Wikimedia Commons
# Note: Before usage, create an account on the source MediaWiki
# for the bot. On Wikimedia Commons, the convention is
# "File Upload Bot (Username)", for example, File Upload Bot (Kernigh).
# Set the username and password below:

$username = "USERNAME";
$password = "PASSWORD";

# Set the pause in seconds after each download
$pause = 1;

# List the wiki PHP scripts where you use the above username and password
%wiki_php = (
    'commons',      'http://commons.wikimedia.org/w/index.php',
    'wb',           'http://en.wikibooks.org/w/index.php',
    'wikibooks',    'http://en.wikibooks.org/w/index.php',
    'sw',           'http://strategywiki.net/w/index.php',
    'strategywiki', 'http://strategywiki.net/w/index.php',
);


# Then run the script on the command line using
#
# $ perl mwdown.pl wiki dirname < filenames
#
# where wiki is one of the wikis from the wiki_php list above,
# and dirname/ is the name of a directory which will contain the files to
# be downloaded, and filenames is a list of files, one per line,
# to download.
#
# Don't edit below unless you know what you're doing.

# We need these libraries. They should be part of a standard Perl
# distribution.
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTTP::Cookies;
use HTML::Parser;
use Encode qw(encode);
use warnings;

$ignore_login_error=0;
$docstring="Please read mwdown.pl for documentation.\n";
my $wiki=$ARGV[0] or die "Syntax: perl mwdown.pl wiki directory\n$docstring";
my $dir=$ARGV[1] or die "Syntax: perl mwdown.pl wiki directory\n$docstring";

# Find the wiki PHP script
$cgi = $wiki_php{$wiki} or die "Unknown wiki: $wiki\n$docstring";

# Find the hostname
if( $cgi =~ m|http://([^/]+)/| ) {
    $cgi_host = $1;
} else {
    die "Unable to extract hostname from:\n$cgi\n";
}

# Make Unix style path
$dir=~s|\\|/|gi;

# Remove trailing slashes
$sep=$/; $/="/"; chomp($dir); $/=$sep;

my $browser=LWP::UserAgent->new;
my @ns_headers = (
    'User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20041107 Firefox/1.0',
    'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*',
    'Accept-Charset' => 'iso-8859-1,*,utf-8',
    'Accept-Language' => 'en-US',
);

$browser->cookie_jar( {} );

$response=$browser->post("$cgi?title=Special:Userlogin&action=submitlogin",
    @ns_headers,
    Content=>[wpName=>$username,wpPassword=>$password,wpRemember=>"1",wpLoginAttempt=>"Log in"]);

# After logging in, we should be redirected to another page.
# If we aren't, something is wrong.
if($response->code!=302 && !$ignore_login_error) {
    print
"We weren't able to login. This could have the following causes:

* The username ($username) or password may be incorrect.
  Solution: Edit mwdown.pl and change them.
* The MediaWiki software has been upgraded.
  Solution: Go to (where?) and get a new version of the script.
* You are trying to hack this script for other wikis. The wiki has
  cookie check disabled.
  Solution: Try setting \$ignore_login_error to 1.

Regardless, we will now try to write the output from the server to
$dir/debug.txt....\n\n";
    open(DEBUG,">$dir/debug.txt") or die "Could not write file.\n";
    print DEBUG $response->as_string;
    print "This seems to have worked. Take a look at the file for further information.\n";
    close(DEBUG);
    exit 1;
}

mkdir $dir;
open(DESC, ">$dir/files.txt") or die "Could not write file.\n";


# HTML parser callbacks: the while() loop first attaches the
# div_handler to the image description page. The div_handler attaches
# a div_a_handler which downloads the image (the first link after
# the <div id="file">).

# This handles the <a> by downloading the linked image and
# terminating the parser.
sub div_a_handler { # tagname,attr,self
    return if shift ne "a";
    my $attrs = shift;
    return unless $$attrs{"href"};

    shift->eof; # terminate all parsing after this element

    my $url = $$attrs{"href"};
    print "Url: $url\n";

    # if url starts with slash, prepend host
    $url = "http://$cgi_host". $url if( $url =~ m|^/| );

    if( $url =~ m|.*/([^/]+)| ) {
        $filename = $1;

        # download image
        print DESC ">$filename\n";
        $browser->get( $url, ":content_file" => "$dir/$filename" );
    } else {
        print "FAILURE: Unable to extract filename\n";
    }
}

# This handles the <div id="file"> containing the image by
# attaching div_a_handler.
sub div_handler { # tagname,attr,self
    return if shift ne "div";
    my $attrs = shift;
    return unless $$attrs{"id"};
    return if $$attrs{"id"} ne "file";

    # If we have not returned, then we found it! Attach image handler.
    shift->handler( start => \&div_a_handler, "tagname,attr,self" );
}

# This handles the <text> element of the Special:Export page containing
# the image description. (Yes, using HTML::Parser on XML.)
sub text_handler { #tagname,self
    return if shift ne "text";

    $self = shift; # the HTML::Parser

    # Attach handler to print the image description.
    $self->handler( text => sub { print DESC (shift) . "\n" }, "dtext" );
    $self->handler( end => sub { shift->eof }, "self" );
}

# This parses all image description pages.
while( <STDIN> ) {
    sleep $pause;

    $file = $_;
    chomp $file; # remove line terminator from each line

    next if $file =~ /^\s*#/; # skip comments in input
    $file =~ s/[_ ]/+/g;      # change _, spaces to +

    $response=$browser->get( "$cgi?title=Image:$file" );
    push @responses, $response->headers_as_string. "\n";

    if( $response->code != 200 ) {
        print "FAILURE: No img desc page: $file\n";
        next;
    }

    print "Downloading: $file\n";

    # Parse the image description page: div_handler will
    # attach div_a_handler to download the image.
    my $parser = HTML::Parser->new( api_version => 3 );
    $parser->handler( start => \&div_handler, "tagname,attr,self" );
    $parser->parse( ${$response->content_ref} );

    # Fetch the image description. Use Special:Export to get it
    # (because we might be blocked from the edit screen).
    $response=$browser->get( "$cgi?title=Special:Export/Image:$file" );
    if( $response->code != 200 ) {
        print "FAILURE: No img desc export: $file\n";
        print DESC "FAILURE: Unable to read ".
            "image description from $wiki.\n";
        next;
    }
    $parser = HTML::Parser->new( api_version => 3 );
    $parser->handler( start => \&text_handler, "tagname,self" );
    $parser->parse( ${$response->content_ref} );
}

print "Everything seems to be OK. Log will be written to $dir/debug.txt.\n";
open(DEBUG,">$dir/debug.txt") or die "Could not write file.\n";
print DEBUG @responses;
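
The input handling in the main loop of mwdown.pl can be tried in isolation. The following is only a sketch, not part of the script: `page_urls` is a hypothetical helper, the file name is a made-up example, a line whose first non-blank character is `#` is assumed to be a comment, and no network access takes place.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of the filenames list into the two
# URLs that the loop requests, or return an empty list for a skipped line.
sub page_urls {
    my ($cgi, $line) = @_;
    chomp $line;
    return if $line =~ /^\s*#/;   # skip comments in input
    return if $line =~ /^\s*$/;   # skip blank lines (an addition of this sketch)
    (my $file = $line) =~ s/[_ ]/+/g;   # change _, spaces to +
    return (
        "$cgi?title=Image:$file",                  # image description page
        "$cgi?title=Special:Export/Image:$file",   # XML export of the description
    );
}

my $cgi = 'http://commons.wikimedia.org/w/index.php';
print "$_\n" for page_urls( $cgi, "Example file.png\n" );
```

The first URL is the page that the script parses for the image link; the second is the export from which it reads the description text.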

mwup.pl

#!/usr/bin/perl
# mwup.pl - MediaWiki file UPLOAD script
#   by Kernigh - xkernigh AT netscape DOT net
#   version 2006-05-01 - this script is in the public domain
# Derived from:
# Upload script by Erik Möller - moeller AT scireview DOT de - public domain
# Developed for the Wikimedia Commons
# Note: Before usage, create an account on the destination MediaWiki
# for the bot. On Wikimedia Commons, the convention is
# "File Upload Bot (Username)", for example, File Upload Bot (Kernigh).
# Set the username and password below:

$username = "USERNAME";
$password = "PASSWORD";

# Set the pause in seconds after each upload
$pause = 120;

# List the wiki PHP scripts where you have the username/password pair
%wiki_php = (
    'commons',      'http://commons.wikimedia.org/w/index.php',
    'wb',           'http://en.wikibooks.org/w/index.php',
    'wikibooks',    'http://en.wikibooks.org/w/index.php',
    'sw',           'http://strategywiki.net/w/index.php',
    'strategywiki', 'http://strategywiki.net/w/index.php',
);


# Then run the script on the command line using
#
# $ perl mwup.pl wiki dirname
#
# where wiki is one of the wikis from the wiki_php list above,
# and dirname/ is the name of a directory containing the files to
# be uploaded, and a file named files.txt in the following format
#
# What you write                Explanation
# @                             This text is appended to every description.
# °Dog photo by Eloquence       This text is used when no description exists.
# >Dog01.jpg                    Name of a file in the specified directory.
# German shepherd dog           Description (can be multi-line).
# >Dog02.jpg                    File without a description (use default)
#
# The "@" and "°" lines are optional, and must be in one line. They can
# occur multiple times in a single file and are only valid until they
# are changed. As a consequence, description lines cannot start with "@"
# or "°".
#
# Don't edit below unless you know what you're doing.

# We need these libraries. They should be part of a standard Perl
# distribution.
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTTP::Cookies;
use Encode qw(encode);
use warnings;

$ignore_login_error=0;
$docstring="Please read mwup.pl for documentation.\n";
my $wiki=$ARGV[0] or die "Syntax: perl mwup.pl wiki directory\n$docstring";
my $dir=$ARGV[1] or die "Syntax: perl mwup.pl wiki directory\n$docstring";

# Find the wiki PHP script
$cgi = $wiki_php{$wiki} or die "Unknown wiki: $wiki\n$docstring";

# Make Unix style path
$dir=~s|\\|/|gi;

# Remove trailing slashes
$sep=$/; $/="/"; chomp($dir); $/=$sep;

# Now try to get the list of files
open(FILELIST,"<$dir/files.txt")
    or die "Could not find file list at $dir/files.txt.\n$docstring";

$standard_text[0]="";
$default_text[0]="";
$stx=0; $dtx=0;
while(<FILELIST>) {
    $line=$_;
    chomp($line);
    if($line=~m/^@/) {
        $line=~s/^@//;
        $standard_text[$stx]=$line;
        $stx++;
        $stw=1;
    }
    elsif($line=~m/^°/) {
        $line=~s/^°//;
        $default_text[$dtx]=$line;
        $dtx++;
        $dtw=1;
    }
    elsif($line=~m/^>/) {
        $line=~s/^>//;

        # New file, but last one doesn't have a description yet -
        # add current default.
        if($currentfile) {
            # If there's been a change of the default or standard
            # text, we need to apply the old text to the previous
            # file, not the new one.
            $dx= $dtw? $dtx-2 : $dtx -1;
            $sx= $stw? $stx-2 : $stx -1;
            if(!$desc_added) {
                $file{$currentfile}.="\n".$default_text[$dx];
            }
            $file{$currentfile}.="\n\n".$standard_text[$sx];
        }
        # Abort the whole batch if this file doesn't exist.
        if(!-e "$dir/$line") {
            die "Could not find $dir/$line. Uploading no files.\n";
        }
        $currentfile=$line;
        $desc_added=0;
        $dtw=0;$stw=0;
    }else {
        # If this is a header comment,
        # we just ignore it. Otherwise
        # it's a file description.
        if($currentfile) {
            $file{$currentfile}.="\n".$line;
            $desc_added=1;
        }
    }
}

# Last file needs to be processed, too
if($currentfile) {
    $dx= $dtw? $dtx-2 : $dtx -1;
    $sx= $stw? $stx-2 : $stx -1;
    if(!$desc_added) {
        $file{$currentfile}.="\n".$default_text[$dx];
    }
    $file{$currentfile}.="\n\n".$standard_text[$sx];
}

my $browser=LWP::UserAgent->new;
my @ns_headers = (
    'User-Agent' => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7) Gecko/20041107 Firefox/1.0',
    'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*',
    'Accept-Charset' => 'iso-8859-1,*,utf-8',
    'Accept-Language' => 'en-US',
);

$browser->cookie_jar( {} );

$response=$browser->post("$cgi?title=Special:Userlogin&action=submitlogin",
    @ns_headers,
    Content=>[wpName=>$username,wpPassword=>$password,wpRemember=>"1",wpLoginAttempt=>"Log in"]);

# After logging in, we should be redirected to another page.
# If we aren't, something is wrong.
if($response->code!=302 && !$ignore_login_error) {
    print
"We weren't able to login. This could have the following causes:

* The username ($username) or password may be incorrect.
  Solution: Edit mwup.pl and change them.
* The MediaWiki software has been upgraded.
  Solution: Go to (where?) and get a new version of the upload script.
* You are trying to hack this script for other wikis. The wiki you
  are uploading to has cookie check disabled.
  Solution: Try setting \$ignore_login_error to 1.

Regardless, we will now try to write the output from the server to
$dir/debug.txt....\n\n";
    open(DEBUG,">$dir/debug.txt") or die "Could not write file.\n";
    print DEBUG $response->as_string;
    print "This seems to have worked. Take a look at the file for further information.\n";
    close(DEBUG);
    exit 1;
}

foreach $key(keys(%file)) {
    sleep $pause;
    print "Uploading $key to the wiki $wiki. Description:\n";
    print $file{$key}."\n" . "-" x 75 . "\n";
    uploadfile:
    $eckey=encode('utf8',$key);
    if($eckey ne $key) {
        symlink("$key","$dir/$eckey");
    }
    $response=$browser->post("$cgi?title=Special:Upload",
        @ns_headers,Content_Type=>'form-data',Content=>
        [
            wpUploadFile=>["$dir/$eckey"],
            wpUploadDescription=>encode('utf8',$file{$key}),
            wpUploadAffirm=>"1",
            wpUpload=>"Upload file",
            wpIgnoreWarning=>"1"
        ]);
    push @responses,$response->as_string;
    if($response->code!=302 && $response->code!=200) {
        print "Upload failed! Will try again. Output was:\n";
        print $response->as_string;
        goto uploadfile;
    } else {
        print "Uploaded successfully.\n";
    }
}

print "Everything seems to be OK. Log will be written to $dir/debug.txt.\n";
open(DEBUG,">$dir/debug.txt") or die "Could not write file.\n";
print DEBUG @responses;
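
The files.txt format that mwup.pl reads can be illustrated with a smaller standalone parser. This is a simplified sketch and not the script's own code: `parse_file_list` is a hypothetical helper, the "@" text in the sample is made up, and the exact whitespace of the resulting descriptions differs from what mwup.pl produces. It assumes, as the script does, that a file gets the "@" and "°" texts that were in effect when its ">" line was read.

```perl
use strict;
use warnings;

# Hypothetical helper: parse files.txt lines into { filename => description }.
# The "°" marker is matched exactly as it is written in this source file.
sub parse_file_list {
    my (@lines) = @_;
    my ($standard, $default) = ('', '');   # current "@" and "°" texts
    my (%desc, %pending, $current);

    # Finish the current file: use the default text if no description was
    # given, then append the standard text.
    my $finish = sub {
        return unless defined $current;
        $desc{$current} = $pending{default} if $desc{$current} eq '';
        $desc{$current} .= "\n\n$pending{standard}" if $pending{standard} ne '';
    };

    for my $line (@lines) {
        chomp $line;
        if    ($line =~ s/^@//) { $standard = $line }
        elsif ($line =~ s/^°//) { $default  = $line }
        elsif ($line =~ s/^>//) {
            $finish->();
            $current = $line;
            $desc{$current} = '';
            %pending = (standard => $standard, default => $default);
        }
        elsif (defined $current) {
            # Description line; anything before the first ">" is ignored.
            $desc{$current} .= ($desc{$current} eq '' ? '' : "\n") . $line;
        }
    }
    $finish->();
    return \%desc;
}

my $files = parse_file_list(
    "\@Appended text\n",
    "°Dog photo by Eloquence\n",
    ">Dog01.jpg\n",
    "German shepherd dog\n",
    ">Dog02.jpg\n",
);
print "$_:\n$files->{$_}\n\n" for sort keys %$files;
```

Here Dog01.jpg keeps its own description and Dog02.jpg falls back to the "°" text; both get the "@" text appended.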