[librecat-dev] Catmandu and Hadoop/Spark?
guenter.hipler at unibas.ch
Wed Feb 17 12:07:42 CET 2016
It would be nice to integrate Catmandu into such processes. But I think
integrating a Perl-based framework is less natural than integrating,
e.g., a Python-based one. All these "Big Data" components are
Java/Scala-based, and Perl is not part of the JVM world (this might
change in the future with Perl6). Spark and Flink
(https://flink.apache.org/) both provide specialized Python clients.
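One generic way to bridge a non-JVM tool into these frameworks is to stream records through an external process over stdin/stdout; this is the idea behind Hadoop Streaming and Spark's RDD.pipe. The following is only a rough sketch of that pattern in plain Python, not a tested integration: the function name pipe_partition and the stand-in command are hypothetical, and whether a Catmandu command line could take the place of the stand-in is an assumption.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for an external, line-oriented converter.
# It reads JSON lines on stdin and writes JSON lines on stdout.
# Assumption: a Perl/Catmandu command could be substituted here.
STAND_IN_CMD = [
    sys.executable,
    "-c",
    "import sys, json\n"
    "for line in sys.stdin:\n"
    "    rec = json.loads(line)\n"
    "    rec['title'] = rec['title'].upper()\n"
    "    print(json.dumps(rec))\n",
]


def pipe_partition(records, cmd=STAND_IN_CMD):
    """Send one partition of records through an external process as JSON lines."""
    payload = "".join(json.dumps(r) + "\n" for r in records)
    result = subprocess.run(
        cmd, input=payload, capture_output=True, text=True, check=True
    )
    # One JSON document per output line; skip blank lines.
    return [json.loads(line) for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    partition = [
        {"id": 1, "title": "catmandu"},
        {"id": 2, "title": "metafacture"},
    ]
    for rec in pipe_partition(partition):
        print(rec)
```

In a cluster setting the framework would call something like this once per partition, so the cost of starting the external process is paid per partition rather than per record.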
I know we already had this discussion more than a year ago ;-)
and for me this was one important reason to use Metafacture for our
project (swissbib). But I still hope that both frameworks (Catmandu /
Metafacture) will come closer together in the future.
Very best wishes from Basel!
On 02/17/2016 10:28 AM, Jakob Voß wrote:
> I just got asked whether Catmandu (or Perl in general) can be used
> with Hadoop or Spark. Has anyone of you tried this before? This is
> what I found for Spark:
> Although we successfully do processing of large data sets with
> Catmandu, I guess it has its limitations with "big data" (whatever
> that means). Maybe it's worth using Catmandu on top of existing big
> data frameworks such as Hadoop and Spark instead of extending Catmandu
> with big data features such as massively parallel processing?
> Just a thought,