
Best Practice for loading a huge file


#1

Hi,
In a transformation, I have to import a very big CSV file (more than 5M rows) that changes only once a month, and many other small files (100 lines max) that change frequently.
Each time, I have to wait for the big CSV file to be imported.
What would you recommend to avoid or limit this time-consuming task (except at the beginning of each month)? Should I export the CSV to a MySQL database? Is there another solution?
Best,
Michel


#2

Hi Michel,

Exporting to a database won’t help because performance would be similar. Instead, load the CSV file in EasyMorph and export it to a Qlik QVD file, then import the QVD file in place of the CSV. QVD is a compressed data format, and EasyMorph reads it faster than regular text files.
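
For readers who want to see the same "parse once, reload from a faster format" idea outside EasyMorph, here is a minimal Python sketch. It is only an illustration of the general pattern, not how EasyMorph or QVD works internally: the function name `load_with_cache` and the use of a pickle file as the fast format are assumptions made for this example.

```python
import csv
import os
import pickle


def load_with_cache(csv_path, cache_path):
    """Return the rows of csv_path, re-parsing the CSV only when it is newer than the cache.

    Hypothetical helper: the pickle file stands in for any fast binary format.
    """
    cache_is_fresh = (
        os.path.exists(cache_path)
        and os.path.getmtime(cache_path) >= os.path.getmtime(csv_path)
    )
    if cache_is_fresh:
        # Fast path: the CSV has not changed since the cache was written.
        with open(cache_path, "rb") as f:
            return pickle.load(f)

    # Slow path: parse the text file, then refresh the cache for next time.
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    with open(cache_path, "wb") as f:
        pickle.dump(rows, f, protocol=pickle.HIGHEST_PROTOCOL)
    return rows
```

With a file that changes once a month, the slow path would run only after each monthly update, and every other run would take the fast path.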


#3

Thank you, Dmitry, for this advice.
Is an “Import a .tde file (Tableau)” transformation on the roadmap?
Michel


#4

No, it’s not. Tableau doesn’t provide an API or a specification for reading .tde files, so we can’t do much here.

If one day they start providing a specification, we will surely add it.


#5

BTW, starting from version 4.0 it will be possible to save loaded data right in EasyMorph projects, so that when you open a project, its start transformations already contain the last loaded data. It will also be possible to export to and import from a native EasyMorph format, which will be very fast to read and write.


#6

That’s what I thought, too!


#7

I have two questions on this topic.

  1. Will EasyMorph be much faster at loading huge CSV files from version 4.0 onwards?
  2. When will version 4.0 become available?

Kind regards!


#8

There are no definite plans to speed up loading CSV files so far.

Version 4.0 will presumably become available by the end of this year or in early 2019.

The native EasyMorph file format will be available sooner than 4.0, probably in 3.9.1. Check out our download page for short-term release plans.


#9

Hi Dmitry,

Another question came to mind…
Is there a way to subset a CSV file (or other file) before it is loaded into EasyMorph, or is this technically not possible?

It could be useful to first load a subset of the data into memory to see what it looks like.

Thanks in advance!


#10

@reynsnivea, the Import from delimited text action has a “Maximum numbers of lines to load” option in the Advanced Options dialog.
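
For comparison, the same "peek at the first N rows" idea can be sketched in a few lines of Python. This is a generic illustration rather than EasyMorph's implementation: `preview_csv` and its parameters are hypothetical names chosen for this example.

```python
import csv
from itertools import islice


def preview_csv(path, max_rows=1000, delimiter=","):
    """Read at most max_rows rows (header included) from the start of a delimited text file."""
    with open(path, newline="", encoding="utf-8") as f:
        # islice stops pulling rows after max_rows, so the rest of the file is never parsed.
        return list(islice(csv.reader(f, delimiter=delimiter), max_rows))
```

Because reading stops early, the preview should stay quick even on a multi-million-row file.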


#11

Thanks, I will check this out!