Benutzer:Westbahnhof/Medienliste erstellen mit Stringfunktionen/en

This is a technical note on how to create a list of the media used in a wikibook, assuming the book's media files are hosted on Wikimedia Commons (the usual case). The procedure is more of a hack: it applies regular expressions to wikitext, and the results have to be checked and edited manually along the way.

General workflow: get the wikitext of your wikibook (or any wiki page), find the media entries (e.g. pictures), look up the description pages of the media files (all at once or in chunks of, say, 20), find the author and licence, and create the table.

Tools: Wikibooks (or any wiki project), Wikimedia Commons, and a text editor that supports regular expressions (LibreOffice Writer is used here). Regular expressions must be set up to match one line at a time (not matching across line breaks) and in case-insensitive mode.
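For readers who would rather script the steps than use Writer, the regex settings described above can be approximated in Python. This is only a sketch: the sample text and the test pattern are made up for illustration and are not part of the original procedure.

```python
import re

# Flags that approximate the editor settings described above:
# IGNORECASE -> case-insensitive; MULTILINE -> ^ and $ anchor at every line.
# DOTALL is deliberately left out, so "." never matches a line break.
FLAGS = re.IGNORECASE | re.MULTILINE

# Made-up sample text, for illustration only.
sample = "intro text\n[[file:Taxus wood.jpg|thumb]]\nmore text"

# Hypothetical check: find whole lines that start with a file link.
matches = re.findall(r"^\[\[(?:File|Datei):.*$", sample, FLAGS)
print(matches)
```

Because DOTALL is not set, each match stays within one line, which mirrors the "one line at a time" requirement.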

Get the wikitext

Get the wikitext of the entire book.

  • This can be done by creating a print version (Druckausgabe). It transcludes the chapters (see w:de:Hilfe:Seiten einbinden) with tags such as {{:Book/Introduction}}. These tags can be substituted to get each chapter's wikitext, by writing {{subst::Book/Introduction}} instead and saving the page somewhere.
  • Alternatively, copy and paste the chapters, or any parts of them, together.
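The rewrite of transclusion tags into substitution tags can also be done mechanically. The following is a minimal sketch, assuming the print version contains only simple {{:Page}} transclusions; the function name to_subst is hypothetical.

```python
import re

def to_subst(print_version_wikitext: str) -> str:
    """Turn {{:Book/Chapter}} transclusions into {{subst::Book/Chapter}}."""
    # Only templates whose name starts with ":" are touched;
    # ordinary templates such as {{Navigation}} are left alone.
    return re.sub(r"\{\{\s*:", "{{subst::", print_version_wikitext)

converted = to_subst("{{:Book/Introduction}}\n{{Navigation}}\n{{:Book/Chapter 1}}")
print(converted)
```

The converted text still has to be saved on a wiki page for the substitution to take effect.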

Copy the wikitext to LibreOffice Writer (or a more text-oriented editor with regular expression support).

Find the media entries

This handles regular [[File:bla.jpg..]] pictures and regular gallery pictures (both one per line, with no <!--comments.jpg--> in between).

Search case-insensitively, one line at a time, and replace (→) each pattern in turn; where nothing follows the arrow, replace with nothing.

^[^\[]*\[\[\s?(File|Datei):\s*     → <f>
<f>[^|\]]*      → &</f>
^[^<\[]+\.(jpg|svg|png|gif|webm)     → <f>&</f>
^[^<].*$|^<[^f].*$|(?<=<f>)File:|(?<=<f>)Datei:      →
^$     →
<f>|</f>.*$    →
.*      → <f>&</f>\n{{subst:File:&}}

That gives:

<f>Taxus wood.jpg</f>
{{subst:File:Taxus wood.jpg}}
...
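The same result can be sketched in Python. This is not a translation of the replacement chain above but a simplified alternative that assumes one media entry per line; the names FILE_LINK, GALLERY_LINE, find_media and to_subst_block, as well as the sample wikitext, are hypothetical.

```python
import re

# [[File:name|...]] or [[Datei:name|...]] links anywhere in a line.
FILE_LINK = re.compile(r"\[\[\s*(?:File|Datei):\s*([^|\]]+)", re.IGNORECASE)
# Gallery lines: an optional File:/Datei: prefix, then a name with a known extension.
GALLERY_LINE = re.compile(
    r"^\s*(?:(?:File|Datei):)?\s*([^<\[|\n]+\.(?:jpg|svg|png|gif|webm))",
    re.IGNORECASE,
)

def find_media(wikitext):
    """Return media file names in document order, without duplicates."""
    names = []
    for line in wikitext.splitlines():
        found = FILE_LINK.findall(line) or GALLERY_LINE.findall(line)
        for name in found:
            name = name.strip()
            if name not in names:
                names.append(name)
    return names

def to_subst_block(names):
    """Format names as <f>..</f> markers followed by a subst: call."""
    return "\n".join(f"<f>{n}</f>\n{{{{subst:File:{n}}}}}" for n in names)

# Made-up sample wikitext.
sample = (
    "Text [[File:Taxus wood.jpg|thumb|Yew]] more\n"
    "<gallery>\n"
    "Datei:Example.svg|caption\n"
    "</gallery>\n"
    "[[Category:Wood]]"
)
media = find_media(sample)
print(to_subst_block(media))
```

As with the editor-based procedure, ordinary prose lines that happen to end in a media extension would be picked up as well, so the list still needs a manual check.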

Look up the description pages

On commons.wikimedia.org, edit any page (e.g. the description page of the book's PDF), paste the list, and use Show changes (Änderungen anzeigen) only to get the results, which are displayed above the edit window. Using Publish (Veröffentlichen) instead would add hundreds of licences and categories to the page.

Copy the results from the blue changes output and paste them (as plain text only) into Writer.

That gives:

<f>Taxus wood.jpg</f>
  	+ 	
== Beschreibung ==
  	+ 	
...
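If the pasted output is processed with a script instead of Writer, the diff marker lines can be dropped first. This is a small sketch that assumes the markers appear as lines containing only whitespace and a "+", as in the excerpt above; strip_diff_markers is a hypothetical helper.

```python
import re

def strip_diff_markers(pasted: str) -> str:
    """Drop empty lines and lines that hold only a '+' diff marker."""
    kept = [
        line for line in pasted.splitlines()
        if not re.fullmatch(r"\s*\+?\s*", line)
    ]
    return "\n".join(kept)

# Sample modelled on the excerpt above.
pasted = "<f>Taxus wood.jpg</f>\n  \t+ \t\n== Beschreibung ==\n  \t+ \t\n"
cleaned = strip_diff_markers(pasted)
print(cleaned)
```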

Find the author and licence

Search → replace

^\s*\|\s*(author|photographer)\s*=.*$      → <a>&</a>
\{\{(GFDL|GPL|LGPL|cc-by|PD-|ART|C0|Geograph|cc-zero|YouTube cc|FAL|Flickr-no known|Copyrighted free|self\|)[^}]*\}\}  → <l>&</l>

Check for authors that span two lines (e.g. a second line starting with *derivate work..).

Search → replace

^.*(?=<l>)|(?<=</l>).*$   →
^[^<].*$|\|\s*(Author|photographer)\s*=|^<[^afl].*$|(?<=<l>)\{\{|\}\}(?=</l>)   →
^$   →
^<f>       → \n<f>

That gives:

<f>Taxus wood.jpg</f>
<a>[[:en:User:MPF|MPF]]<!-- AGF according to enwp source --></a>
<l>GFDL|migration=relicense</l>

..
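A rough Python equivalent of this tagging step, reusing the two search patterns from above with capture groups added. The description text is a made-up sample shaped like the example output, and tag_description is a hypothetical helper; authors that span two lines are not handled here either.

```python
import re

# Same patterns as in the search list above, with capture groups added.
AUTHOR = re.compile(
    r"^[ \t]*\|[ \t]*(?:author|photographer)[ \t]*=[ \t]*(.*)$",
    re.IGNORECASE | re.MULTILINE,
)
LICENCE = re.compile(
    r"\{\{((?:GFDL|GPL|LGPL|cc-by|PD-|ART|C0|Geograph|cc-zero|YouTube cc|FAL"
    r"|Flickr-no known|Copyrighted free|self\|)[^}]*)\}\}",
    re.IGNORECASE,
)

def tag_description(file_name, description):
    """Return <f>, <a> and <l> lines for one file description page."""
    lines = [f"<f>{file_name}</f>"]
    lines += [f"<a>{a.strip()}</a>" for a in AUTHOR.findall(description)]
    lines += [f"<l>{lic}</l>" for lic in LICENCE.findall(description)]
    return "\n".join(lines)

# Made-up description page, for illustration only.
description = "\n".join([
    "== Beschreibung ==",
    "{{Information",
    "|Description=Yew wood",
    "|Author=[[:en:User:MPF|MPF]]<!-- AGF according to enwp source -->",
    "}}",
    "== Lizenz ==",
    "{{GFDL|migration=relicense}}",
])
tagged = tag_description("Taxus wood.jpg", description)
print(tagged)
```

Files with no recognised author or licence simply produce fewer lines, which makes missing tags easy to spot during the manual check.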

Create the table

More search → replace steps:

^(<l>.*)\|         → $1·     (run about ten times, until no | is left on the <l> lines; each pass replaces only the last one)
<f>        → |-\n|{{f|
</f>       → ||30|l}}
<a>|<l>    → |
</a>|</l>  →

Now add

{|  class="wikitable"
!File!!Author hint!!Licence hint

and

|} 

at the top and bottom respectively. Check for missing or duplicate licence or author tags. For example, personal licence tags (in the User namespace) are not recognised; they must be subst:-ituted again and then show up in the blue changes table when clicking Show changes. Other problems may turn up as well. Once these are fixed, the table is done and can be inserted in a wiki page.
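The table assembly can be sketched in Python as well. The row layout follows the replacements above, including the {{f|..||30|l}} call exactly as it appears in them; build_table and the record format are hypothetical, and the manual checks still apply.

```python
def build_table(records):
    """Build wikitable markup from (file, author, licence) tuples."""
    rows = ['{| class="wikitable"', "!File!!Author hint!!Licence hint"]
    for file_name, author, licence in records:
        rows.append("|-")
        # Same wrapper as produced by the <f> and </f> replacements.
        rows.append("|{{f|" + file_name + "||30|l}}")
        rows.append("|" + author)
        # Pipes inside the licence tag would start a new cell, so use a dot.
        rows.append("|" + licence.replace("|", "·"))
    rows.append("|}")
    return "\n".join(rows)

table = build_table([("Taxus wood.jpg", "MPF", "GFDL|migration=relicense")])
print(table)
```

Replacing the pipes in one call does the same job as repeating the ^(<l>.*)\| replacement by hand.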

{| class="wikitable"
!File!!Author hint!!Licence hint
|-
|File:Taxus wood.jpg||MPF||GFDL·migration=relicense
|-
|File:پیش تنیدگی.Pole.svg||*File:پیش تنیدگی.jpg: Nomani mitra *derivate work: User:Westbahnhof||self·cc-by-sa-3.0
|}
..

The table actually works only on Wikimedia Commons; the example above is an edited version of the real output.