[librecat-dev] Working with MARC bibs and holdings
Hemme, Felix
F.Hemme at zbw.eu
Fri Dec 6 11:32:20 CET 2024
Hi, I'm currently processing metadata from the German Union Catalog of Serials (ZDB) that contains MARC bibs and their associated holdings for a given library. It's a classic ETL approach: fetch the data via SRU, run it through a fix file, and convert it to TSV. My setup:
A catmandu.yaml file with the content:
importer:
  marcxml:
    package: MARC
    options:
      type: XML
  zdb:
    package: SRU
    options:
      base: https://services.dnb.de/sru/zdb
      recordSchema: MARC21plus-xml
      parser: marcxml
      limit: 100
A simple fix file rules_marc.fix to test if the conversion is working:
marc_map("001","ppn")
remove_field(record);
remove_field(_id);
A shell script get_marc.sh:
#!/bin/sh
catmandu convert zdb --query "sigel=206 and frm=O and dok=Zeitung" --fix rules_marc.fix to CSV --fields ppn --sep-char '\t' > marc_records_zdb.tsv
Running the script creates an empty TSV file. I assume this is because the records in the XML are nested inside a <collection/> element: https://services.dnb.de/sru/zdb?version=1.1&operation=searchRetrieve&query=sigel%3D206+AND+frm%3DO+AND+dok%3DZeitschrift&recordSchema=MARC21plus-xml&maximumRecords=1. How do I access that element, and how can I filter for <record type="Bibliographic"> or <record type="Holdings">?
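To illustrate the nesting I mean, here is a minimal Python sketch (standard library only, not Catmandu). The sample XML is hand-written and the identifiers in it are made up. It assumes that each <collection/> uses the standard MARCXML namespace and holds <record/> elements distinguished only by their type attribute. The helper names records_by_type and control_field are hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC 21 slim") namespace.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Hand-written sample, NOT real ZDB data: one <collection> that holds
# a bibliographic record and one holdings record.
SAMPLE = f"""<collection xmlns="{MARC_NS}">
  <record type="Bibliographic">
    <controlfield tag="001">bib-0001</controlfield>
  </record>
  <record type="Holdings">
    <controlfield tag="001">hold-0001</controlfield>
  </record>
</collection>"""


def records_by_type(xml_text, record_type):
    """Return all <record> elements whose type attribute equals record_type."""
    root = ET.fromstring(xml_text)
    return [
        rec
        for rec in root.iter(f"{{{MARC_NS}}}record")
        if rec.get("type") == record_type
    ]


def control_field(record, tag):
    """Return the text of the first controlfield with the given tag, or None."""
    for field in record.findall(f"{{{MARC_NS}}}controlfield"):
        if field.get("tag") == tag:
            return field.text
    return None


bib_ids = [control_field(r, "001") for r in records_by_type(SAMPLE, "Bibliographic")]
holding_ids = [control_field(r, "001") for r in records_by_type(SAMPLE, "Holdings")]
print(bib_ids, holding_ids)
```

What I am looking for is the equivalent of this filtering step inside the Catmandu importer or the fix file.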
When I change the recordSchema to MARC21-xml so that the response contains only MARC bibs (no holdings), a list of PPNs (record identifiers) is created as expected: https://services.dnb.de/sru/zdb?version=1.1&operation=searchRetrieve&query=sigel%3D206+AND+frm%3DO+AND+dok%3DZeitschrift&recordSchema=MARC21-xml&maximumRecords=1.
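The two example requests differ only in the recordSchema parameter. A small Python sketch that rebuilds both URLs from the same query makes this explicit. The function name sru_url is hypothetical, and the parameter values are copied from the links above.

```python
from urllib.parse import urlencode

BASE = "https://services.dnb.de/sru/zdb"


def sru_url(query, record_schema, maximum_records=1):
    """Build a searchRetrieve URL; only record_schema differs between the two cases."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": query,
        "recordSchema": record_schema,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes "=" and turns spaces into "+".
    return f"{BASE}?{urlencode(params)}"


query = "sigel=206 AND frm=O AND dok=Zeitschrift"
bibs_only = sru_url(query, "MARC21-xml")              # bibs only: works in my setup
bibs_and_holdings = sru_url(query, "MARC21plus-xml")  # bibs plus holdings: empty TSV
print(bibs_only)
print(bibs_and_holdings)
```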
Best,
Felix