[librecat-dev] ISBN extraction use case

Spiros Antonio spiros.antonio at gmail.com
Thu Sep 13 11:08:19 CEST 2018


Hello,

To extract all the ISBNs, we use the following 'dedup.fix' script:

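# Collect every 020$a value into the identifier array
marc_map(020a, identifier.$append)

# Drop everything from the first whitespace onwards (e.g. a trailing qualifier)
replace_all(identifier.*,"\s+.*","")

# Normalize each identifier to ISBN-13
do list(path:identifier)
  isbn13(.)
end

# Gather the record ids per ISBN and export the mapping as YAML
do hashmap(exporter:YAML)
  copy_field(identifier,key)
  copy_field(_id,value)
end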

$ catmandu convert MARC to Null --fix dedup.fix < marc.mrc > output.yml

Then we use the 'cleanup.fix' script:

select exists(value.1)

join_field(value,",")

$ catmandu convert YAML to TSV --fix cleanup.fix < output.yml > result.csv

This produces a tab-delimited file listing the ISBNs that occur more than
once in the MARC input file, together with the ids of the records that
contain them.
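
To make the logic of the two steps explicit, here is a rough plain-Python sketch of the same idea (group record ids by normalized ISBN, then keep only the ISBNs with more than one id). It does not use Catmandu. The record layout (`id`, `isbns`), the function names and the sample values are made up for illustration, the ISBN-13 is emitted without hyphens, check digits are not validated, and a record id is counted only once per ISBN, which may differ from what the Fix scripts do.

```python
import re
from collections import defaultdict


def to_isbn13(raw):
    """Drop a trailing qualifier and hyphens; return an unhyphenated ISBN-13 or None."""
    token = re.sub(r"\s+.*", "", raw.strip()).replace("-", "").upper()
    if re.fullmatch(r"\d{13}", token):
        return token
    if re.fullmatch(r"\d{9}[\dX]", token):
        # ISBN-10 to ISBN-13: prefix 978, then recompute the check digit
        core = "978" + token[:9]
        total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
        return core + str((10 - total % 10) % 10)
    return None


def duplicate_isbns(records):
    """Map each ISBN-13 found in more than one record to the list of record ids."""
    ids_by_isbn = defaultdict(list)
    for rec in records:
        seen = set()  # count a record once per ISBN (assumption of this sketch)
        for raw in rec["isbns"]:
            isbn = to_isbn13(raw)
            if isbn is not None and isbn not in seen:
                seen.add(isbn)
                ids_by_isbn[isbn].append(rec["id"])
    return {isbn: ids for isbn, ids in ids_by_isbn.items() if len(ids) > 1}


# Hypothetical sample records, not taken from a real catalogue
records = [
    {"id": "rec1", "isbns": ["0306406152 (pbk.)"]},
    {"id": "rec2", "isbns": ["978-0-306-40615-7"]},
    {"id": "rec3", "isbns": ["9781234567897"]},
]

for isbn, ids in duplicate_isbns(records).items():
    print(isbn + "\t" + ",".join(ids))
```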

How could one add more information to this file, for example something like

duplicateISBN, id1,600a,600b,id2,600a,600b
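
To show what I mean by that layout, here is a small plain-Python sketch (again outside Catmandu) that builds such rows from a duplicate-ISBN mapping and a per-record lookup of the 600a/600b values. The names `widen_rows`, `to_tsv` and `subjects_by_id`, as well as the sample values, are hypothetical; how to get the same result with Fix scripts is exactly the open question.

```python
import csv
import io


def widen_rows(duplicates, subjects_by_id):
    """Build one row per ISBN: isbn, then (id, 600a, 600b) for each record id."""
    rows = []
    for isbn, ids in sorted(duplicates.items()):
        row = [isbn]
        for rec_id in ids:
            subj = subjects_by_id.get(rec_id, {})
            # Missing subfields become empty cells
            row += [rec_id, subj.get("600a", ""), subj.get("600b", "")]
        rows.append(row)
    return rows


def to_tsv(rows):
    """Serialize the rows as tab-delimited text."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical sample data, for illustration only
duplicates = {"9780306406157": ["rec1", "rec2"]}
subjects_by_id = {
    "rec1": {"600a": "name A", "600b": "numeration A"},
    "rec2": {"600a": "name B"},
}

print(to_tsv(widen_rows(duplicates, subjects_by_id)), end="")
```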

Thank you