Hello,
As developers, we would like an option to load only the first x rows of every table in a project, where x is a parameter. This would speed up our development significantly.
Best regards,
Chris
Thanks Bolud, but not exactly. We’re looking for a way to limit all tables — whether they come from files or databases — in one place, instead of having to manually set “keep only the first x rows” on each source table. Something like the Debug mode of Qlik Sense where you can limit all loads to x rows.
I see. I agree that could be a useful feature.
Hi everyone,
While the proposed feature may sound interesting and useful during development, I believe it hides a few pitfalls that are worth considering before applying it broadly across all data tables in a project.
From my experience, other platforms (for example, Tableau Prep) also adopt a similar approach to speed up prototyping. However, this kind of strategy can introduce some significant risks:
Loss of key records – If the sample of rows is not deterministic but random, you risk excluding records that act as logical connectors (join keys) between tables. This can alter the workflow structure and hide potential data consistency issues that would otherwise be detected.
Partial analysis and misleading debugging – Working on a random subset of data can lead to incorrect conclusions during design and debugging phases, as certain errors or anomalies might only appear when the full dataset is used.
Data cleaning and normalization – Some data cleaning or deduplication operations rely on the full distribution of values. Limiting the number of rows might prevent the identification of patterns or inconsistencies, making data validation less effective.
Safer alternatives – A more controlled approach could be to:
Load deterministic subsets (e.g., using a date filter or specific IDs).
Introduce a global parameter that enables/disables sampling only on selected tables.
Clearly document when and why sampling is applied, to avoid ambiguity in results.
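To make the alternatives above concrete, here is a minimal sketch in plain Python. It is not the API of any particular tool: `SamplingConfig`, `load_table`, `inner_join` and the sample tables are all hypothetical names invented for illustration. It shows one global switch that applies only to tables that opt in, and it compares "keep the first x rows" against a deterministic ID filter. Note that even a non-random "first x rows" cut can drop the rows a join needs if the tables are ordered differently.

```python
from dataclasses import dataclass, field


@dataclass
class SamplingConfig:
    """Hypothetical project-wide sampling settings (illustrative names only)."""
    enabled: bool = False
    max_rows: int = 1000
    tables: set = field(default_factory=set)  # tables that opt in to sampling


def load_table(name, rows, config, keep=None):
    """Return the rows of a table, applying the global rule only if it opts in.

    rows: list of dicts standing in for a file or database source.
    keep: optional deterministic predicate, e.g. a date or ID filter.
    """
    if not config.enabled or name not in config.tables:
        return list(rows)                      # sampling off: full load
    if keep is not None:
        return [r for r in rows if keep(r)]    # deterministic subset
    return list(rows[: config.max_rows])       # plain "first x rows"


def inner_join(left, right, key):
    """Tiny inner join on a single key, just enough for the demonstration."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Made-up data: customers sorted ascending, orders referencing them descending.
customers = [{"customer_id": i, "name": f"c{i}"} for i in range(1, 7)]
orders = [{"order_id": 100 + i, "customer_id": 7 - i} for i in range(1, 7)]

off = SamplingConfig(enabled=False)
on = SamplingConfig(enabled=True, max_rows=3, tables={"customers", "orders"})

# Full load: every order finds its customer.
full_join = inner_join(
    load_table("customers", customers, off),
    load_table("orders", orders, off),
    "customer_id",
)

# First 3 rows of each table: customers 1-3 versus orders for customers 6-4.
truncated_join = inner_join(
    load_table("customers", customers, on),
    load_table("orders", orders, on),
    "customer_id",
)

# Deterministic subset: the same ID filter applied to both tables.
wanted = {1, 2}
by_id = lambda r: r["customer_id"] in wanted
subset_join = inner_join(
    load_table("customers", customers, on, keep=by_id),
    load_table("orders", orders, on, keep=by_id),
    "customer_id",
)

print(len(full_join), len(truncated_join), len(subset_join))
```

With this data, the truncated load produces an empty join even though nothing is wrong with the data, while the ID-based subset keeps matching rows on both sides. That is the kind of misleading result to watch for when a row limit is applied to all tables at once.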
In summary, this feature could definitely speed up development, but it should be used carefully: overly truncating the data might compromise the overall quality and reliability of the project.