I'm pulling data from the YouTube API. There is a maximum of 50 records per page, and to request the next 50 records you have to provide the "pageToken" parameter.
So I can pull the first 50 records no problem and get the following results, for example:
If I manually update the "pageToken" parameter with the value "EAAaHlBUOkNESWlFRVV5T1RNM01Ua3hOakl6TnpNeU5UTQ" I get the next set of results, etc.
I'm obviously trying to figure out a way of doing this with some kind of iteration, so it takes the latest value from "nextPageToken" to request the next page, and when the "nextPageToken" column does not exist, it stops.
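For readers who think in code, the loop being asked for can be sketched in Python as below. This is only an illustration of the logic, not EasyMorph syntax. The "items" and "nextPageToken" field names follow the YouTube response; fetch_page, fake_fetch and FAKE_PAGES are hypothetical stand-ins for the real web request so the sketch runs offline.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield items from every page, following nextPageToken until it is absent."""
    token: Optional[str] = None  # the first request carries no pageToken
    while True:
        page = fetch_page(token)
        # Keep the rows of this page first, so the last page is not lost.
        yield from page.get("items", [])
        token = page.get("nextPageToken")
        if not token:  # field missing on the last page: stop
            break


# Hypothetical canned responses standing in for the real HTTP call.
FAKE_PAGES = {
    None: {"items": [{"id": 1}, {"id": 2}], "nextPageToken": "tokA"},
    "tokA": {"items": [{"id": 3}], "nextPageToken": "tokB"},
    "tokB": {"items": [{"id": 4}]},  # last page: no nextPageToken
}


def fake_fetch(token: Optional[str]) -> dict:
    return FAKE_PAGES[token]


all_items = list(iter_items(fake_fetch))
print([item["id"] for item in all_items])
```

The key point is the order of operations: the page's rows are kept before the token is checked, so the final page is still collected.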
Yes, there is indeed, by moving your request to the API to a separate module and then using the "Repeat" action to call it over and over until there are no more results.
I think there might be an issue here as there will always be a results page, only the nextPageToken field will not be present in the last page of results.
I'm trying to think how I can keep the last set of results whilst stopping the process when:
not columnexists([nextPageToken])
is true?
The YouTube API has this nextPageToken and previousPageToken, which have to be added as variables when requesting a set of results, which makes this a little more complicated I think.
I have used a lot of APIs and I feel the pain of using a paged one!
Repeat is a great action but it can sometimes be tricky. In your case I suggest the following:
MAIN MODULE
Just before the Repeat action, create a new field called "#Command". In this field you insert the part of the URL that will change, in your example:
/playlistItems?part=snippet&playlistId=UU_ZR3lKBFJM-VX-JTgF7dfA&maxResults=50&key=[APIKey]&pageToken=
After that, insert the Repeat action.
REPEATED MODULE
You'll get the input data using the input action
Put a "Skip actions on condition" with the condition "isempty(#Command)"
Put a "Web request" action that will call the URL+#Command
Do your things on the returned data
Put a "Modify column" that will modify #Command as follows: if(isempty(FieldOfTheNextToken),"","/playlistItems?part=snippet&playlistId=UU_ZR3lKBFJM-VX-JTgF7dfA&maxResults=50&key=[APIKey]&pageToken="&FieldOfTheNextToken)
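The same flow can be sketched in Python to make the state passing easier to follow. This is an analogy only, not how EasyMorph runs a Repeat: here the loop stops when the command string is empty, whereas Repeat stops on an empty result table. BASE_COMMAND is a shortened, hypothetical version of the #Command value, and FAKE_RESPONSES replaces the real web request.

```python
from typing import Callable, Dict, List, Tuple

# Shortened stand-in for the "#Command" value from the post (hypothetical).
BASE_COMMAND = "/playlistItems?part=snippet&maxResults=50&pageToken="


def repeated_module(command: str, fetch: Callable[[str], dict]) -> Tuple[List[str], str]:
    """One pass of the repeated module: returns (rows, next #Command)."""
    if command == "":  # mirrors the skip condition isempty(#Command)
        return [], ""
    response = fetch(command)  # mirrors the "Web request" on URL + #Command
    rows = response.get("items", [])
    token = response.get("nextPageToken", "")
    # Mirrors the "Modify column" step: empty when there is no next token.
    next_command = BASE_COMMAND + token if token else ""
    return rows, next_command


def run_repeat(fetch: Callable[[str], dict], max_iterations: int = 100) -> List[str]:
    command = BASE_COMMAND  # set in the main module, before the loop starts
    collected: List[str] = []
    for _ in range(max_iterations):
        rows, command = repeated_module(command, fetch)
        collected.extend(rows)
        if command == "":
            break
    return collected


# Hypothetical canned responses keyed by the command string.
FAKE_RESPONSES: Dict[str, dict] = {
    BASE_COMMAND: {"items": ["v1", "v2"], "nextPageToken": "tokA"},
    BASE_COMMAND + "tokA": {"items": ["v3"]},  # last page, no token
}

videos = run_repeat(lambda command: FAKE_RESPONSES[command])
print(videos)
```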
Understanding Repeat can take a while, but it's perfect for managing paged APIs!
I hope this helps.
The issue is that the first time you get the URL, the pageToken is blank, so if the skip condition is that #Command is empty, it won't run at all.
HOWEVER - you got me thinking!
The way I cracked it was to add a column at the end of the repeated module called "nextPageToken" with the value END.
This means that if the "nextPageToken" column is already present, the new column is called "nextPageToken(2)" and is effectively ignored.
But in the last set of results, the newly created column is called "nextPageToken" with the value END in it, which is passed back to the Repeat action.
I put in a Skip condition before the Web Request action so that if nextPageToken = 'END' it skips, which creates the blank table and therefore stops the Repeat.
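In code terms, the trick amounts to defaulting a missing token to a sentinel value and stopping when the sentinel comes back. A minimal Python sketch of that idea follows; it is an analogy, since the column-renaming behaviour is specific to EasyMorph, and FAKE_PAGES is a hypothetical stand-in for the API.

```python
from typing import Callable, Dict, List

END = "END"  # sentinel value used in the post


def token_or_end(page: dict) -> str:
    """Return the page's nextPageToken, or END when the field is absent."""
    return page.get("nextPageToken", END)


def fetch_all(fetch_page: Callable[[str], dict]) -> List[str]:
    token = ""  # blank token on the first request
    rows: List[str] = []
    while token != END:  # plays the role of the skip condition
        page = fetch_page(token)
        rows.extend(page.get("items", []))  # last page's rows are kept
        token = token_or_end(page)
    return rows


# Hypothetical canned pages keyed by the token that requests them.
FAKE_PAGES: Dict[str, dict] = {
    "": {"items": ["a", "b"], "nextPageToken": "t1"},
    "t1": {"items": ["c"], "nextPageToken": "t2"},
    "t2": {"items": ["d"]},  # last page: no nextPageToken field
}

result = fetch_all(lambda token: FAKE_PAGES[token])
print(result)
```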
Thank you for getting the old cogs turning in my head again!
When I get a few minutes I'm going to pull together some examples for paginated APIs. There are a few different ways they work and having an example for each will hopefully help others trying to do the same.
Wondering if you could use the columnexists() function in the skip action to do the check instead without having to add additional dummy fields?
@mattf it'll be useful to have some examples of paginated APIs. In my recent work I have seen:
Paginated API with a nextToken (which needs to be passed in a GET method)
Paginated API with the page number / number of the current last item
Paginated API where not every page contains data.
The last scenario was the worst: an API that returns 100 elements per page, but can return empty data in some iterations. I would be grateful if you could show me the proper way to manage this case. I managed to do it successfully, but I know my way may be trickier than the EasyMorph team's one!
In the submodule called by the Repeat you usually have this structure:
Read the input data
Skip action that verifies the exit condition
Web request
Do some fun things on data
After that the repeat action will ... repeat (sorry for the joke) the entire loop till the end.
In the last point of the list we're usually gonna parse the JSON response, clean/format the data, keep the access token and nextPageToken etc., but we'll usually also apply at least one or two filters, and the result of these filters can be an empty list. Example: we call a DAM API that returns all products and their data:
Item      | Images
Product A | image1.png
Product B |
Product C | image2.png, image3.png
Product D | imageX.png
Let's say, for example, that we need the product-image association: we're gonna filter the "images" column, removing the empty ones, ok?
In the example we'll have 3 records, but it could happen that in a given page of products none has an image, so the filter will return an empty table.
In that case, at the end of the processing, the next repeat cycle will see a true exit condition and stop too early.
To manage this we created a fake record that persists in the table and is not excluded by the filter; that way the accessToken and nextPageToken are preserved on every cycle.
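A small Python sketch of the fake-record idea, assuming rows are plain dictionaries. PLACEHOLDER, process_page and drop_placeholders are hypothetical names made up for illustration; the product data mirrors the example table.

```python
from typing import List, Optional

PLACEHOLDER = "__placeholder__"  # hypothetical marker for the fake record


def process_page(products: List[dict], next_token: Optional[str]) -> List[dict]:
    """Keep products that have images, plus one row that always survives."""
    kept = [dict(p, nextPageToken=next_token) for p in products if p["images"]]
    # The fake record carries the paging state even when the filter
    # removes every real product on this page.
    kept.append({"item": PLACEHOLDER, "images": [], "nextPageToken": next_token})
    return kept


def drop_placeholders(rows: List[dict]) -> List[dict]:
    """Remove the fake records once the loop has finished."""
    return [r for r in rows if r["item"] != PLACEHOLDER]


page = [
    {"item": "Product A", "images": ["image1.png"]},
    {"item": "Product B", "images": []},
    {"item": "Product C", "images": ["image2.png", "image3.png"]},
    {"item": "Product D", "images": ["imageX.png"]},
]
rows = process_page(page, "tokA")

# A page where no product has an image still yields one row with the token.
empty_page_rows = process_page([{"item": "Product B", "images": []}], "tokB")
print(len(drop_placeholders(rows)), empty_page_rows[0]["nextPageToken"])
```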
That's what I thought as well - generate a fake record and then filter it out after the "Repeat" action exits.
Alternatively, collect all records (with empty images or not), and do all the additional filtering/calculation after the "Repeat" action exits, on the full dataset.
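The second option can be sketched in Python as follows, again as an analogy rather than EasyMorph behaviour. The loop's exit depends only on the paging token, never on how many rows survive a filter, and the filtering happens once on the full dataset. PAGES and fetch_page are hypothetical stand-ins for the API.

```python
from typing import Callable, Dict, List, Optional


def collect_raw(fetch_page: Callable[[Optional[str]], dict]) -> List[dict]:
    """Gather every product from every page, with no filtering in the loop."""
    token: Optional[str] = None
    raw: List[dict] = []
    while True:
        page = fetch_page(token)
        raw.extend(page["items"])
        token = page.get("nextPageToken")
        if token is None:  # exit depends on the token only
            break
    return raw


# Hypothetical pages; the middle one has no product with an image.
PAGES: Dict[Optional[str], dict] = {
    None: {"items": [{"item": "Product A", "images": ["image1.png"]}],
           "nextPageToken": "p2"},
    "p2": {"items": [{"item": "Product B", "images": []}],
           "nextPageToken": "p3"},
    "p3": {"items": [{"item": "Product D", "images": ["imageX.png"]}]},
}

raw = collect_raw(lambda token: PAGES[token])
# Filtering happens once, after the loop, on the full dataset.
with_images = [p["item"] for p in raw if p["images"]]
print(with_images)
```

The trade-off is that the unfiltered rows are held until the end, which costs more memory on large result sets.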
I'm glad that my solution is the same as yours. About the second option: how can you manage to exit the Repeat without checking and skipping? If the Repeat reaches the "n" cycles without exiting, it raises an error.
I misunderstood you too: you suggest getting all the raw JSON and managing the data only after the "Repeat" action. It could be useful, yes, though maybe a little more time- and resource-consuming in execution?