[librecat-dev] Handling server errors in long running oai-pmh harvests

Patrick Hochstenbach Patrick.Hochstenbach at UGent.be
Tue Dec 6 10:20:31 CET 2016


Hi, there is also a resumptionToken option you can provide on the command line to restart a harvest.

E.g.

 $ catmandu convert OAI --url http://somewhere --resumptionToken=189eDa01

(All options need double hyphens.)

Be aware that an OAI-PMH server can decide for itself how long a resumptionToken is valid. This trick might not work on all servers.
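
On the retry question in the quoted message below: one option outside Catmandu is a small shell wrapper that re-runs a failed command a few times. This is only a minimal sketch. `retry` is a hypothetical helper, not a Catmandu feature, and the URL and token in the usage comment are the placeholders from the example above.

```shell
#!/bin/sh
# retry MAX DELAY CMD...
# Run CMD until it exits successfully, at most MAX times, sleeping DELAY
# seconds between attempts. Hypothetical helper, not part of Catmandu.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Usage (placeholder URL and token):
#   retry 5 60 catmandu convert OAI --url http://somewhere --resumptionToken=189eDa01
```

Note that every attempt restarts from the same token, so records already written by a failed attempt will be harvested again, and the token may have expired by the time of a later attempt.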

Patrick

> On 6 Dec 2016, at 09:39, Vitali Peil <vitali.peil at uni-bielefeld.de> wrote:
> 
> Hi Dan Michael,
> 
> you are hitting a sore spot here: there is an open issue that addresses the problems you are running into:
> 
> https://github.com/LibreCat/Catmandu-OAI/issues/7
> 
> We welcome contributions! We can discuss strategies to keep this importer stable while running for hours.
> 
> Cheers,
> 
> Vitali
> 
> 
> On 05.12.2016 at 14:49, "Dan Michael O. Heggø" wrote:
>> Hi,
>> 
>> Do you have any tips on handling intermittent server errors occurring in long-running harvests?
>> 
>> For a harvest of mine I got around 800,000 records before hitting an error:
>> 
>> ERROR: http://bibsys-k.alma.exlibrisgroup.com/view/oai/47BIBSYS_UBO/request : all at all@oai_komplett at marc21@3535592670002204 : Server closed connection without sending any data back
>> 
>> Eliminating random server errors doesn't seem realistic, so I'm wondering if Catmandu bails out immediately or does multiple retries, and if this is something that can be configured? If not, is the best way to handle it to write a small Perl script that catches certain exceptions?
>> 
>> Also, is it possible to pass in the continuation token to the OAI importer as a flag or something to resume the harvest from where it stopped?
>> 
>> Thanks a lot for your feedback!
>> 
>> Dan Michael
>> _______________________________________________
>> librecat-dev mailing list
>> - send list mails to librecat-dev at lists.uni-bielefeld.de
>> - to unsubscribe or change options, visit https://lists.uni-bielefeld.de/mailman2/cgi/unibi/listinfo/librecat-dev
>> - project website: http://librecat.org/
> 
> -- 
> Vitali Peil
> Office U3-200/E1-144, Tel. +49521-106-4010/6125
> Bielefeld University Library
> 

Patrick Hochstenbach - digital architect
University Library Ghent
Sint-Hubertusstraat 8 - 9000 Ghent - Belgium
patrick.hochstenbach at ugent.be
+32 (0)9 264 7980



