<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"Times New Roman \(Body CS\)";
panose-1:2 11 6 4 2 2 2 2 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0cm;
margin-right:0cm;
margin-bottom:0cm;
margin-left:36.0pt;
font-size:10.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
mso-ligatures:none;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 2.0cm 70.85pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:1427073455;
mso-list-type:hybrid;
mso-list-template-ids:-1805218188 -98254246 134807555 134807557 134807553 134807555 134807557 134807553 134807555 134807557;}
@list l0:level1
{mso-level-start-at:3;
mso-level-number-format:bullet;
mso-level-text:-;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:21.0pt;
text-indent:-18.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-font-family:"Times New Roman";}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:57.0pt;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:\F0A7 ;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:93.0pt;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:\F0B7 ;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:129.0pt;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:165.0pt;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:\F0A7 ;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:201.0pt;
text-indent:-18.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:\F0B7 ;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:237.0pt;
text-indent:-18.0pt;
font-family:Symbol;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:273.0pt;
text-indent:-18.0pt;
font-family:"Courier New";}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:\F0A7 ;
mso-level-tab-stop:none;
mso-level-number-position:left;
margin-left:309.0pt;
text-indent:-18.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style>
</head>
<body lang="en-BE" link="#0563C1" vlink="#954F72" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal"><span lang="NL" style="font-size:14.0pt;mso-fareast-language:EN-US">Hello Martina,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="NL" style="font-size:14.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">This is a known feature. Catmandu was created to work with very large files in streaming mode. This means that Catmandu doesn’t read the complete input data first to
see which fields are available. The software has to work with the data at hand.<br>
<br>
The JSON format allows every record to have different fields, and doesn’t care if ‘late-comer’ records have a different field layout. Formats such as TSV and CSV don’t have this feature: from the first record onward, the software must be told which fields to
make available. Catmandu uses two procedures:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<ul style="margin-top:0cm" type="disc">
<li class="MsoListParagraph" style="margin-left:-15.0pt;mso-list:l0 level1 lfo1">
<span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">If no information is provided, the field layout is guessed from the first record it receives.<o:p></o:p></span></li><li class="MsoListParagraph" style="margin-left:-15.0pt;mso-list:l0 level1 lfo1">
<span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">One can tell Catmandu explicitly which fields should appear in the output.<o:p></o:p></span></li></ul>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">For the latter Catmandu has the `--fields` option:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;font-family:'Courier New';mso-fareast-language:EN-US">$ catmandu convert JSON to CSV --fix my.fix --fields 'id,name,title,author' &lt; data.json<br>
<br>
</span><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">As I am writing this, I see there is also a `--collect_fields 1` option that does what you want, but it first loads all the data into memory.<o:p></o:p></span></p>
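<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">To make the three behaviours concrete, here is a minimal Python sketch. It is a toy model and not Catmandu's actual code: the function `to_csv`, its parameters and the sample records are all made up for illustration, and the column order it produces when collecting fields (first seen) may differ from what Catmandu writes.<o:p></o:p></span></p>
<pre>
```python
import csv
import io

def to_csv(records, fields=None, collect_fields=False):
    """Write dict records as CSV; a toy model, not Catmandu's code."""
    if collect_fields:
        # Buffer the whole input so every field name can be seen first.
        records = list(records)
        fields = []
        for record in records:
            for key in record:
                if key not in fields:
                    fields.append(key)
    out = io.StringIO()
    writer = None
    for record in records:
        if writer is None:
            # Without an explicit layout, guess it from the first record.
            header = list(fields) if fields else list(record)
            writer = csv.DictWriter(out, fieldnames=header, restval="",
                                    extrasaction="ignore", lineterminator="\n")
            writer.writeheader()
        writer.writerow(record)
    return out.getvalue()

# Hypothetical sample data: the second record introduces a new field.
records = [
    {"id": 1, "name": "first"},
    {"id": 2, "name": "second", "title": "late-comer"},
]

guessed = to_csv(iter(records))                                   # layout from first record
explicit = to_csv(iter(records), fields=["id", "name", "title"])  # layout given up front
collected = to_csv(iter(records), collect_fields=True)            # layout from all records
print(guessed)
print(explicit)
print(collected)
```
</pre>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">In this sketch only the third call has to hold all records in memory before writing anything; the first two write each record as it arrives, which is why the first one drops the `title` column.<o:p></o:p></span></p>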
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;font-family:'Courier New';mso-fareast-language:EN-US"><br>
</span><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">Best regards,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">Patrick<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:14.0pt;font-family:'Courier New';mso-fareast-language:EN-US"><br>
</span><span lang="EN-US" style="font-size:14.0pt;mso-fareast-language:EN-US">PS: Due to Outlook issues, some hyphens and quotes may be lost in this email.</span><span lang="EN-US" style="font-size:14.0pt;font-family:'Courier New';mso-fareast-language:EN-US"> <o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:14.0pt;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<div id="mail-editor-reference-message-container">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span lang="DE" style="font-size:12.0pt;color:black">From:
</span></b><span lang="DE" style="font-size:12.0pt;color:black">librecat-dev-bounces@lists.uni-bielefeld.de &lt;librecat-dev-bounces@lists.uni-bielefeld.de&gt; on behalf of Siebert, Dr. Martina &lt;Martina.Siebert@sbb.spk-berlin.de&gt;<br>
<b>Date: </b>Friday, 3 November 2023 at 15:29<br>
<b>To: </b>librecat-dev@lists.uni-bielefeld.de &lt;librecat-dev@lists.uni-bielefeld.de&gt;<br>
<b>Subject: </b>[librecat-dev] Catmandu: field not converted to TSV/CSV output if not present in first (how many?) input records<o:p></o:p></span></p>
</div>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt">Hello,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt"> <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">It does not seem possible to produce TSV/CSV output that includes fields that first appear only much later in an input file. In the JSON export all is fine, but when exporting to TSV/CSV
the “late-comer” fields are missing. When I fake-add the field to the first record, the TSV/CSV export is correct.</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt">Is this a known bug? Can it be fixed?</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt"> </span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt">Martina<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:8.0pt;font-family:'Arial',sans-serif">_____________________________________________</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt">Dr. Martina Siebert</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt">Ostasienabteilung | CrossAsia</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt">Staatsbibliothek zu Berlin – Preußischer Kulturbesitz</span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt"> </span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt"><a href="mailto:martina.siebert@sbb.spk-berlin.de">martina.siebert@sbb.spk-berlin.de</a></span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt"><a href="http://www.staatsbibliothek-berlin.de/">www.staatsbibliothek-berlin.de</a></span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt"> </span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:9.0pt">Personal data may be processed in the course of e-mail communication.
<br>
Our data protection notice can be found here: <a href="http://sbb.berlin/datenschutz" target="_short" title="Kurz-URL">
http://sbb.berlin/datenschutz</a></span><span lang="DE" style="font-size:11.0pt"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="DE" style="font-size:11.0pt"> <o:p></o:p></span></p>
</div>
</div>
</div>
</body>
</html>