[librecat-dev] [EXT] Working with MARC bibs and holdings

Rolschewski, Johann Johann.Rolschewski at sbb.spk-berlin.de
Mon Dec 9 12:50:20 CET 2024


Hi Felix,

> Running the script creates an empty TSV file. I assume this is because the
> XML is nested within a <collection/> element:
> https://services.dnb.de/sru/zdb?version=1.1&operation=searchRetrieve&query=sigel%3D206+AND+frm%3DO+AND+dok%3DZeitschrift&recordSchema=MARC21plus-xml&maximumRecords=1
> How do I access the element, and how can I filter for
> <record type="Bibliographic"> or <record type="Holdings">?
>
> When I change the recordSchema to MARC21-xml to only contain MARC bibs
> (not holdings), a list of PPNs (record identifiers) is created as expected:
> https://services.dnb.de/sru/zdb?version=1.1&operation=searchRetrieve&query=sigel%3D206+AND+frm%3DO+AND+dok%3DZeitschrift&recordSchema=MARC21-xml&maximumRecords=1

There are two problems:

+ Catmandu::Importer::SRU can't handle a MARC record wrapped in a <collection> element.

+ The record schema "MARC21plus-xml" returns a list of bibliographic and holdings records for each match, whereas Catmandu can only handle and fix a single record per match.

The SRU standard specifies that a <record> element in the result set should contain a record. "MARC21plus-xml" was developed for a special use case and is based on a "loose" interpretation of the standard that treats a whole MARC <collection> as a "record". I don't see how a solution for this use case could be implemented in Catmandu in a sane and easy way...
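
To illustrate what such a <collection> looks like and how its records could be told apart by their type attribute, here is a minimal sketch in plain Python, outside Catmandu. It is not a Catmandu solution. The sample XML is invented rather than real ZDB output, the helper name split_collection is hypothetical, and I am assuming the usual MARCXML namespace and that every record carries its identifier in controlfield 001:

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace (assumed to be the one used in the response).
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Invented sample, not real ZDB data: one collection holding one
# bibliographic record and two holdings records.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record type="Bibliographic"><controlfield tag="001">bib-1</controlfield></record>
  <record type="Holdings"><controlfield tag="001">hold-1</controlfield></record>
  <record type="Holdings"><controlfield tag="001">hold-2</controlfield></record>
</collection>"""


def split_collection(xml_text):
    """Group the 001 values of all records in a collection by record type."""
    root = ET.fromstring(xml_text)
    groups = {}
    for record in root.iter(MARC_NS + "record"):
        # Records without a type attribute are filed under "unknown".
        record_type = record.get("type", "unknown")
        control = record.find(MARC_NS + "controlfield[@tag='001']")
        identifier = control.text if control is not None else None
        groups.setdefault(record_type, []).append(identifier)
    return groups


if __name__ == "__main__":
    print(split_collection(SAMPLE))
```

In a real SRU response the <collection> sits inside the SRU record wrapper, so it would have to be extracted from there first; the sketch skips that step.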

The ZDB offers another record schema, "MARC21plus-1-xml", which contains some holdings information in 924 fields within the bibliographic MARC record. Catmandu can handle these:

$ catmandu convert -v SRU \
--base http://services.dnb.de/sru/zdb \
--recordSchema MARC21plus-1-xml \
--parser marcxml \
--query 'tit=soil biology' \
--total 10 \
--fix 'marc_map(924$a,eid,join:"; ");remove_field(record)' \
to TSV --fields '_id,eid'
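
To make the effect of the fix more concrete: as far as I understand it, marc_map with the join option collects the $a subfields of all 924 fields and joins them into one string. The following Python sketch only approximates that behaviour for a single record. The sample record and its values are invented, join_subfields is a hypothetical helper, and Catmandu's actual implementation may differ in details:

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace.
MARC_NS = "{http://www.loc.gov/MARC21/slim}"

# Invented sample record with two 924 fields (values are placeholders).
SAMPLE_RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim" type="Bibliographic">
  <controlfield tag="001">bib-1</controlfield>
  <datafield tag="924" ind1=" " ind2=" ">
    <subfield code="a">holding-A</subfield>
    <subfield code="b">ignored</subfield>
  </datafield>
  <datafield tag="924" ind1=" " ind2=" ">
    <subfield code="a">holding-B</subfield>
  </datafield>
</record>"""


def join_subfields(record_xml, tag, code, separator="; "):
    """Join the values of one subfield code across all fields with the given tag."""
    record = ET.fromstring(record_xml)
    values = []
    for field in record.iter(MARC_NS + "datafield"):
        if field.get("tag") != tag:
            continue
        for sub in field.iter(MARC_NS + "subfield"):
            # Keep only non-empty values of the requested subfield.
            if sub.get("code") == code and sub.text:
                values.append(sub.text)
    return separator.join(values)


if __name__ == "__main__":
    print(join_subfields(SAMPLE_RECORD, "924", "a"))
```

The resulting string corresponds to what ends up in the eid column of the TSV, one row per bibliographic record.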

Best

Johann


