Below is a method for designing workflows that query paginated APIs using the "Repeat" action. It is a simplified and improved version of what I suggested previously. It still has a few non-obvious gotchas, so I've tried to cover all of them. The method also takes advantage of recently added features.
The example uses PokéAPI (pokeapi.co), as suggested by @adambeltz.
Here is the project. It contains just 7 actions in two modules. Its logic and the design process are explained below:
api-paging.morph (6.2 KB)
The main module:
The "Get page" module:
The starting point
Start by constructing the initial dataset of the loop. Typically, it's a 1-line dataset that satisfies 2 conditions:
- It has the columns required to construct an API request. In our case, an API request only needs the URL path, so one column would suffice.
- The column name(s) must be the same as the column names in the result table of the iterated module because the "Repeat" action passes the result of one iteration to the input of the next iteration. To understand the recursive iteration mechanism of the action, please read the help article on the "Repeat" action.
As we can see from the API response, the example API returns the URL path of the next page in the JSON property "next". So, we construct our initial dataset as one column named "next". In this example, I used the "Create list" action for that. In a real-life project, you will probably need to construct the initial dataset from a parameter or some other user input.
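For readers who think more easily in code, here is a rough Python picture of that starting dataset. It is only an analogy, not something EasyMorph runs: the list-of-dicts representation, the starting URL value, and the helper names make_initial_dataset and missing_columns are all assumptions made for illustration.

```python
# A dataset is modelled here as a list of rows, each row a dict keyed by
# column name. This is an illustrative stand-in, not an EasyMorph structure.

# Assumed starting URL; in a real project it would come from a parameter.
FIRST_PAGE_URL = "https://pokeapi.co/api/v2/pokemon?offset=0&limit=20"


def make_initial_dataset(first_url: str) -> list[dict]:
    """One row, one column named 'next' (the column the iteration reads)."""
    return [{"next": first_url}]


def missing_columns(required: list[str], dataset: list[dict]) -> list[str]:
    """Return the required column names that the dataset does not have."""
    present = set(dataset[0].keys()) if dataset else set()
    return [name for name in required if name not in present]


initial = make_initial_dataset(FIRST_PAGE_URL)

# Hypothetical result of one iteration: it must carry 'next' as well,
# because it becomes the input of the following iteration.
iteration_result = [{"next": "https://pokeapi.co/api/v2/pokemon?offset=20&limit=20"}]

needed = list(initial[0].keys())
problems = missing_columns(needed, iteration_result)
```

If problems came back non-empty, the second iteration would not find the column it needs, which is the mismatch the second condition above is meant to prevent.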
The iterated module that queries the API
Now, let's create a new module called "Get page" with the "Input" action in it and add the "Repeat" action that calls "Get page" in the main module.
Useful tip: press the "Populate automatically" button to populate the input dataset from the parent module:
The iterated module must have a condition that returns an empty dataset when there is nothing more to fetch from the API. An empty result dataset signals to the "Repeat" action that it must stop iterating further and exit the loop.
In our case, when the last page is retrieved, the API still returns the JSON property "next", but it's empty. So we can simply check whether it's empty in the "Skip..." action (step 2).
Important! Notice that we don't exit the loop as soon as we receive the last page, because its result still has to be returned to the parent module (otherwise the last page would be missing from the result of the "Repeat" action). Instead, we do one more iteration: the "Repeat" action passes the output of the last web request to the next iteration, and only then do we exit the loop by making the "Skip..." action produce an empty dataset before another web request is sent.
If the API omitted the property entirely instead of returning it empty, we could use the columnexists() function to check for that.
Important! Configure the "Skip..." action to return an empty dataset when the condition is not fulfilled, because by default it doesn't.
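The stop condition and the extra iteration may be easier to follow in ordinary code. The sketch below is not how EasyMorph implements "Repeat"; it is a rough Python imitation under stated assumptions. FAKE_API, get_page, repeat, and the max_iterations safety cap are made-up names, and canned dictionaries stand in for real web requests.

```python
# Canned pages keyed by URL path; a stand-in for real HTTP responses.
FAKE_API = {
    "/page1": {"next": "/page2", "results": ["bulbasaur", "ivysaur"]},
    "/page2": {"next": "/page3", "results": ["venusaur"]},
    "/page3": {"next": None, "results": ["charmander"]},  # last page
}

fetched = []  # records every simulated web request, for inspection


def get_page(input_rows: list[dict]) -> list[dict]:
    """One iteration of the loop: skip, or fetch and parse one page."""
    # Take the value from the first row, like "First column value".
    # .get() covers both cases: the property is empty or absent.
    url = input_rows[0].get("next")
    if not url:
        return []  # empty dataset: nothing more to fetch
    fetched.append(url)
    body = FAKE_API[url]
    # Every output row carries 'next' so the following iteration can read it.
    return [{"next": body["next"], "name": name} for name in body["results"]]


def repeat(module, initial: list[dict], max_iterations: int = 1000) -> list[dict]:
    """Feed each result into the next call; stop on an empty result."""
    collected = []
    current = initial
    for _ in range(max_iterations):  # assumed guard against endless loops
        result = module(current)
        if not result:
            break
        collected.extend(result)  # append every non-empty result
        current = result
    return collected


all_rows = repeat(get_page, [{"next": "/page1"}])
names = [row["name"] for row in all_rows]
```

Here the module is called four times but only three requests are made: the fourth call receives the rows of the last page, sees an empty "next", and returns nothing, which ends the loop without losing the last page.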
The web request and response
Finally, let's send the API request and parse the response. Different APIs require indicating the next page in different ways; there is no standard for that. Sometimes it's in the URL query parameters, sometimes it's in the request body. Some creative API designers may even use request headers for that.
Some APIs use the "offset/limit" (a.k.a. "skip/take") method to specify the next page. Some APIs return a "cursor" - a semi-random ID that points the API to the next page. Luckily, EasyMorph can handle virtually any API specification.
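To make the two styles concrete, here is a small Python sketch of how the next request might be formed in each. The base URL, the parameter name "cursor", and the function names are hypothetical; real APIs choose their own names and may put these values in the body or headers instead.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://example.com/items"  # placeholder endpoint, not a real API


def offset_limit_url(base: str, offset: int, limit: int) -> str:
    """Offset/limit style: the client computes where the next page starts."""
    return f"{base}?{urlencode({'offset': offset, 'limit': limit})}"


def next_offset(offset: int, limit: int, total: int) -> Optional[int]:
    """Offset of the following page, or None when there are no more rows."""
    following = offset + limit
    return following if following < total else None


def cursor_url(base: str, cursor: Optional[str]) -> str:
    """Cursor style: the server hands back an opaque token to send next time."""
    if cursor is None:
        return base  # first request carries no cursor
    return f"{base}?{urlencode({'cursor': cursor})}"


second_page = offset_limit_url(BASE, 20, 10)
after_token = cursor_url(BASE, "abc123")
```

With offset/limit the client can jump to any page, while with a cursor it can only follow the token returned by the previous response, which is why the cursor style fits the "pass the result to the next iteration" pattern so naturally.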
In our case, the next page is conveniently specified by two URL parameters, offset and limit, which we can extract from the URL returned by the API in the JSON property named "next".
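As an illustration of what that URL carries, the following Python sketch pulls the paging parameters out of a "next" URL using the standard urllib.parse module. The sample URL is a hand-written value in the shape PokéAPI uses, and paging_params is a hypothetical helper; the workflow described here does not need this step, because it passes the whole URL to the web request.

```python
from urllib.parse import parse_qs, urlsplit


def paging_params(next_url: str) -> dict:
    """Extract the integer 'offset' and 'limit' query parameters from a URL."""
    query = parse_qs(urlsplit(next_url).query)
    return {
        name: int(values[0])
        for name, values in query.items()
        if name in ("offset", "limit")
    }


# Hand-written example value, not a captured response.
sample_next = "https://pokeapi.co/api/v2/pokemon?offset=20&limit=20"
params = paging_params(sample_next)
```

Parsing the parameters separately would only be needed if the request had to be rebuilt from its parts, for example to change the page size between iterations.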
Since our URL path is in a column, we use the "First column value" method to specify the request path:
In the last step, we parse the response using the "Parse JSON" action:
Notice that the last step produces the column "next", which is required in the input dataset of the next iteration of the loop.
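The shape of that parsing step can be pictured with a short Python sketch. The JSON below is a trimmed, hand-written sample in the general shape of a PokéAPI list response, and parse_page is a hypothetical helper; the actual columns produced by "Parse JSON" depend on how the action is configured.

```python
import json

# Trimmed, hand-written sample; values are illustrative only.
SAMPLE_RESPONSE = """
{
  "count": 4,
  "next": "https://pokeapi.co/api/v2/pokemon?offset=2&limit=2",
  "previous": null,
  "results": [
    {"name": "bulbasaur", "url": "https://pokeapi.co/api/v2/pokemon/1/"},
    {"name": "ivysaur", "url": "https://pokeapi.co/api/v2/pokemon/2/"}
  ]
}
"""


def parse_page(text: str) -> list[dict]:
    """Flatten one response into rows, repeating 'next' on every row."""
    body = json.loads(text)
    next_url = body.get("next")  # None on the last page
    return [
        {"next": next_url, "name": item["name"], "url": item["url"]}
        for item in body.get("results", [])
    ]


rows = parse_page(SAMPLE_RESPONSE)
```

Repeating "next" on every row is one simple way to guarantee the column exists in the result; note that in this sketch a page with no results would yield no rows at all and therefore end the loop.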
To make the "Repeat" action collect and automatically append all responses, don't forget to switch it to the "Append and return all results" mode.
To see what will be in the input of your iterated module on the 2nd iteration, right-click the last action in the result table and choose "Send output to sandbox/module" to send the result to the "Input" action of this module.
For debugging purposes, you can keep sending the result dataset to the input to see what happens on the 3rd iteration, the 4th, and so on.
If you need to edit data in the input dataset, send it to a sandbox (using the menu command described in the previous tip), edit the sandbox using the "Dataset editor", and then send the edited dataset from the sandbox back to the input.
In the example, our iterated module has only one table, so I didn't flag it as the default result table because there is no ambiguity. If your iterated module has more than one table, mark the result table by right-clicking the table header and selecting "Flag as default result table". This tells EasyMorph exactly which table is the result of the workflow. Otherwise, the "Repeat" action will return an error because it won't know which table contains the result.