[librecat-dev] ISBN extraction use case
Spiros Antonio
spiros.antonio at gmail.com
Thu Sep 13 11:08:19 CEST 2018
Hello,
To extract all the ISBN numbers, we use the following 'dedup.fix' script:
marc_map(020a, identifier.$append)
replace_all(identifier.*,"\s+.*","")
do list(path:identifier)
  isbn13(.)
end
do hashmap(exporter:YAML)
  copy_field(identifier,key)
  copy_field(_id,value)
end
$ catmandu convert MARC to Null --fix dedup.fix < marc.mrc > output.yml
Then we use the 'cleanup.fix' script:
select exists(value.1)
join_field(value,",")
$ catmandu convert YAML to TSV --fix cleanup.fix < output.yml > result.csv
This produces a tab-delimited file listing the duplicate ISBN numbers found
in the MARC input file, each with the ids of the records that share it.
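For anyone less familiar with the hashmap bind, the two steps above amount to grouping record ids by ISBN and keeping only the ISBNs that collected more than one id. Here is a minimal Python sketch of that logic. It is not Catmandu itself: the function name `duplicate_isbns` and the sample records are made up for illustration, and the ISBN-13 normalization is assumed to have happened already.

```python
from collections import defaultdict


def duplicate_isbns(records):
    """Group record ids by ISBN and keep ISBNs that collected 2+ ids."""
    ids_by_isbn = defaultdict(list)
    for rec in records:
        for isbn in rec.get("identifier", []):
            ids_by_isbn[isbn].append(rec["_id"])
    # Mirrors 'select exists(value.1)': keep keys with at least two ids.
    return {isbn: ids for isbn, ids in ids_by_isbn.items() if len(ids) > 1}


# Made-up records; the identifier strings are placeholders, not real ISBNs.
records = [
    {"_id": "rec1", "identifier": ["ISBN-A"]},
    {"_id": "rec2", "identifier": ["ISBN-A", "ISBN-B"]},
    {"_id": "rec3", "identifier": ["ISBN-C"]},
]

for isbn, ids in duplicate_isbns(records).items():
    # Mirrors 'join_field(value,",")' followed by tab-separated output.
    print(isbn + "\t" + ",".join(ids))
```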
How could one add more information to this output, for example a row layout like:
duplicateISBN, id1,600a,600b,id2,600a,600b
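To make the target layout concrete, here is a small Python sketch of the row I have in mind. It only illustrates the shape of the output and says nothing about how to do it in Catmandu; the helper name `extended_row`, the dictionary keys and the sample values are all hypothetical.

```python
def extended_row(isbn, recs):
    """Flatten one duplicate ISBN and its records into a single list of cells."""
    row = [isbn]
    for rec in recs:
        # Missing subfields become empty cells so the columns stay aligned.
        row += [rec["_id"], rec.get("600a", ""), rec.get("600b", "")]
    return row


# Made-up example: two records sharing one placeholder ISBN.
recs = [
    {"_id": "rec1", "600a": "Name One", "600b": "I"},
    {"_id": "rec2", "600a": "Name Two"},
]
print(",".join(extended_row("ISBN-A", recs)))
```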
Thank you.