High memory usage when saving large files in EasyMorph

Hi,

We have noticed excessive memory usage when saving large files (QVD & Parquet) with EasyMorph. For example, we load multiple files partitioned by year and month, with a total of 33 million rows. The loading process is fast, but saving the imported data consumes up to 41 GB of RAM and takes approximately 30 minutes. The same process in Qlik Sense peaks at about 10 GB and takes 13 minutes.
The EM server has 64 GB of memory, while the Qlik Sense server has 128 GB. Whether or not this difference is relevant, is it possible to optimize this memory usage?

Best regards,
Chris

Hi Chris,

From your description it's not clear which format you export to in EasyMorph (QVD or Parquet?) and in Qlik Sense (also Parquet, or just QVD?).

Generally speaking, 41 GB doesn't look like the memory required just to save the dataset. Since EasyMorph is an in-memory application, it keeps all data in memory, so I assume 41 GB is the memory consumed by the whole workflow, not just the export action alone. From that perspective, 41 GB is normal for a 33-million-row dataset. The (very) general rule of thumb in EasyMorph is 1 GB per 1 million rows, and 41 GB for 33 million rows is in the same ballpark.

Also, Qlik Sense will always be more efficient at writing QVD files because QVD is a native format for Qlik Sense but not for EasyMorph. EasyMorph has to convert data on the fly when writing QVDs. The conversion is generally fast, as it requires no compression, but it's still a performance overhead.

What doesn't look right to me is the duration of the export. It shouldn't take 30 minutes for the export action to write 33 million rows. Such a slowdown can happen if:

  • You're exporting into a network folder, or
  • You're exporting into a cloud-synchronized folder, such as a synchronized Google Drive / OneDrive / Dropbox / etc. folder.

Network/synchronized folders perform poorly with frequent small write operations. Consider exporting into a local (not synchronized) folder first, and then copying/moving the file to the desired destination with a separate action.
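
To illustrate the idea outside of EasyMorph, here is a minimal sketch of the same pattern in Python with pyarrow (the table and both paths are made up for the example):

    import shutil
    import pyarrow as pa
    import pyarrow.parquet as pq

    # Stand-in dataset; in practice this would be the full 33-million-row table.
    table = pa.table({"year": [2024, 2024], "month": [1, 2], "value": [1.0, 2.0]})

    local_path = r"C:\Temp\export.parquet"                # hypothetical fast local folder
    network_path = r"\\fileserver\share\export.parquet"   # hypothetical network destination

    # 1. Write the Parquet file to a local disk first.
    pq.write_table(table, local_path)

    # 2. Move the finished file to the network share in one sequential transfer,
    #    which network/synchronized folders handle much better than frequent small writes.
    shutil.move(local_path, network_path)

In EasyMorph the same pattern is simply the export action pointed at a local folder, followed by a separate action that copies or moves the file to the network destination.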

Hi Dmitry,

Thank you for the clear explanations. We export data in both formats: QVD for legacy systems and Parquet for our new architecture. We ran some checks: each action was executed manually, and we compared memory usage at the same point in time, making sure no other processes were running. During the export action we observed a memory usage of 41 GB (all operations were performed via a remote connection to the server). We ran these tests for both QVD and Parquet, with roughly the same results.

The files are indeed saved to a network folder, but I noticed a significant difference in performance between opening a file (QVD or Parquet) and saving the same file back, without any transformation, in either format. Perhaps the team of EM magicians could find a way to speed this process up?
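
For reference, a rough way to check whether the share itself is the bottleneck is to time a plain copy of a comparable file to the same network folder, independently of EasyMorph (a sketch with made-up paths):

    import shutil
    import time

    src = r"C:\Temp\export.parquet"                 # hypothetical local copy of the exported file
    dst = r"\\fileserver\share\export.parquet"      # hypothetical network folder used for exports

    start = time.perf_counter()
    shutil.copy2(src, dst)                          # one sequential copy of the whole file
    print(f"Copy took {time.perf_counter() - start:.1f} s")

If a plain copy like this is fast while the export action to the same folder is slow, the overhead is likely in the frequent small writes during export rather than in the network bandwidth itself.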

Best regards,
Chris