In version 5.9.5, released earlier this week, we've added the concept of "Triggers" to EasyMorph Server Tasks.
Historically, tasks could be run manually, or be scheduled either to run once at a specific date and time or to run on a repeating schedule. It was also possible to run tasks when external events happened by making an API call to EasyMorph Server, but that required significant programming skills to set up. Triggers allow you to run tasks not only on a schedule, as before, but also when other events occur, both within EasyMorph and externally. Some examples: when a supplier uploads a new file to a specific folder, when an email is received from a customer, or when a new record appears in a database table.
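For context on the pre-trigger approach: kicking off a task on an external event meant writing code along these lines. This is only a Python sketch; the endpoint path, task ID, and authorization header below are placeholders rather than the real EasyMorph Server API, so consult the API documentation for the actual calls.

```python
import requests

# Placeholder values -- substitute your real server URL, task ID, and API key.
SERVER = "https://em-server.example.com"
TASK_ID = "00000000-0000-0000-0000-000000000000"

def start_task(task_id: str) -> None:
    """Ask EasyMorph Server to run a task (the endpoint path is illustrative)."""
    response = requests.post(
        f"{SERVER}/api/tasks/{task_id}/start",  # placeholder path
        headers={"Authorization": "Bearer <api-key>"},
        timeout=30,
    )
    response.raise_for_status()

# A file watcher, mail handler, or similar would call this when its event fires.
start_task(TASK_ID)
```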
v5.9.5 introduces the first two triggers: the "Schedule trigger" and the "Email Received trigger". You can read more about what they do and how to configure them in the blog post accompanying the release.
In future releases we plan on adding more triggers, including:
The File trigger - when a file is added or updated in a specific folder. This could be a file uploaded to a space, via the new built-in SFTP server, or via just about any other method of getting files onto your EM Server.
The Dropbox trigger - like the File trigger, but monitoring files on Dropbox instead.
The Database Row trigger - when a row is added to a chosen database table.
Down to the real reason for this post. The possibilities are almost endless, so we'd love to hear from you: what events or scenarios would you like to see a trigger for? Or are there any specific features you'd like to see within the triggers mentioned above?
Server Task Triggers add another promising layer for enterprise scheduling!
Really looking forward to the database row trigger. Maybe, besides adding a row, the trigger could also fire if any row in the database table is changed or deleted?
Instead of referring within a project to another module/project to call (a successor), you could also design this as a trigger (a predecessor) and thereby chain/link projects. The trigger would fire when a project on EM Server has run (successfully or with failure) and might use its predecessor's output as an input.
That's more challenging technically, because adding new rows can be tracked relatively easily via the total row count, or via an increasing timestamp/rowID. Tracking changed rows, however, would require comparing all rows (or their hashes) on every check.
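To illustrate the difference in cost, here is a minimal Python sketch using sqlite3 and an assumed orders table with an ever-increasing id column (all names are illustrative). New rows fall out of a single aggregate query, while changed rows force a hash of every row on every poll.

```python
import hashlib
import sqlite3

# Assumed setup: a SQLite database with an 'orders' table whose primary key
# 'id' only ever increases.
conn = sqlite3.connect("example.db")

def new_rows_since(last_seen_id: int) -> bool:
    """Cheap check: one aggregate query reveals whether rows were added."""
    (max_id,) = conn.execute("SELECT COALESCE(MAX(id), 0) FROM orders").fetchone()
    return max_id > last_seen_id

def snapshot_hashes() -> dict:
    """Expensive check, step 1: hash every row on every poll."""
    return {
        row[0]: hashlib.sha256(repr(row).encode()).hexdigest()
        for row in conn.execute("SELECT * FROM orders")
    }

def changed_row_ids(old_snapshot: dict, new_snapshot: dict) -> set:
    """Expensive check, step 2: diff the new snapshot against the previous one."""
    return {rid for rid, h in new_snapshot.items() if old_snapshot.get(rid) != h}
```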
Initially, we thought about having triggers for started/failed projects. However, after giving it more thought, it became clear that it would be very hard to track such dependencies, as they would be implicit.
Instead, we will offer a new mechanism called "pipelines" or "processes" that will link multiple tasks into a single pipeline (process, flow). In such a pipeline, it will be explicitly defined which task is triggered after another task finishes.
Really looking forward to the database row trigger. Maybe, besides adding a row, the trigger could also fire if any row in the database table is changed or deleted?
That’s probably achievable, but in a tricky way.
This could work in a two-phase setup.
First, you create another SQL database table (mon_table) and configure SQL database triggers such as AFTER UPDATE and AFTER DELETE on the original table. When a trigger fires, it updates a column value (timestamp/rowID) in mon_table.
Second, you configure an EM Server database trigger to watch mon_table for timestamp/rowID updates.
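Sketched with SQLite (trigger syntax varies by database, and mon_table and orders are assumed names), the first phase could look like this:

```python
import sqlite3

conn = sqlite3.connect("example.db")  # assumed database with an 'orders' table

conn.executescript("""
-- Phase 1: a one-row monitoring table that the triggers keep up to date.
CREATE TABLE IF NOT EXISTS mon_table (
    id INTEGER PRIMARY KEY CHECK (id = 1),
    last_change TIMESTAMP
);
INSERT OR IGNORE INTO mon_table (id, last_change) VALUES (1, CURRENT_TIMESTAMP);

-- Bump the timestamp whenever a row in the original table changes.
CREATE TRIGGER IF NOT EXISTS orders_after_update AFTER UPDATE ON orders
BEGIN
    UPDATE mon_table SET last_change = CURRENT_TIMESTAMP WHERE id = 1;
END;

CREATE TRIGGER IF NOT EXISTS orders_after_delete AFTER DELETE ON orders
BEGIN
    UPDATE mon_table SET last_change = CURRENT_TIMESTAMP WHERE id = 1;
END;
""")
conn.commit()
# Phase 2: point the EM Server database trigger at mon_table.last_change.
```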
I think about this problem as being similar to trying to implement incremental extraction processes from a database.
The simplest is always a system where records can only be added. Financial accounting systems are often like this: once I've added a record, it can't be deleted, and my only option is to add another record reversing out the previous one. This would be handled by the "database rows added" trigger. I need only store the last primary key field value, or the last created-date field value, to be able to check whether there are any new records.
The next simplest is a system where there is a reliable last-modified date. I stress the need for reliability, as I've lost count of the number of times I've been told a last-modified field was always updated, only to find this wasn't the case when the incremental load process started to diverge from reality. If you have reliable modified dates, and they are indexed properly for performance, then the problem is essentially the same as when adding records: rather than checking max(id) against my cached value from the last check, I simply check max(modified_date) against the last cached value. Since dates are stored as numerics in most DBs, the check is the same.
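As a concrete illustration of that watermark pattern (table and column names assumed), the modified-date check is structurally identical to the new-ID check:

```python
import sqlite3

# Assumed 'orders' table storing modified_date as ISO-8601 text, which
# compares correctly as strings; for numeric dates the comparison is the same.
conn = sqlite3.connect("example.db")

def has_changes(last_cached: str) -> bool:
    """Compare the table's max(modified_date) with the value cached last check."""
    (current_max,) = conn.execute(
        "SELECT COALESCE(MAX(modified_date), '') FROM orders"
    ).fetchone()
    return current_max > last_cached

# For the append-only case, swap modified_date for the primary key
# or created-date column; the shape of the check does not change.
```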
As @ckononenko suggested, the next best solution is to add a reliable modified date to the database, either in the same table or in a dedicated one. In my experience, SQL triggers are the most reliable way to ensure it is always updated, but they do come with an overhead on the SQL server that your DBAs may object to. If a modified date can be added, the problem then falls into the above solution.
And if none of these are possible or reliable, as @dgudkov said, things get a lot more complex to implement as a reliable solution. There are always ways and means to achieve it, but they are rarely efficient and tend to come with known problems, such as needing a full extract every night to realign the data. Building a reliable solution for this in EasyMorph would, I'm sure, be a lot harder and a lot more work. Certainly something to give more thought to, though.
Thanks all for the ideas and discussions - the power of the EasyMorph community hive mind at its best! Keep them coming.
Thank you for your comments. I understand the challenges, and I'm happy to share that the standard design principle for our data tables is to include an RGDT, which covers the date (not datetime), and an LMDT, which is the date of the last modification to the particular data row. As long as the trigger can also check database rows on a daily basis (what changed yesterday), this would already be of great help.
We use SharePoint as a drop zone for files, as people on the business side work with SharePoint. Will it be possible to put a trigger on file uploads to SharePoint as well?
For clarity, the database rows trigger will only monitor a table for new (and possibly modified) records and then execute a task if any are found. The morph will still need to perform the incremental load, getting just the new records and merging them with the records already extracted; the trigger won't perform the incremental load itself. In your scenario, where you want to check once a day for records updated yesterday, the database trigger won't make much difference compared to simply scheduling the task to run overnight.
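To make the division of labour concrete: outside EasyMorph, in plain Python, the incremental-load step that the morph itself would perform looks roughly like this (database files, table, and columns are assumed names):

```python
import sqlite3

src = sqlite3.connect("source.db")     # assumed source table 'orders'
local = sqlite3.connect("extract.db")  # local copy maintained incrementally
local.execute(
    "CREATE TABLE IF NOT EXISTS orders "
    "(id INTEGER PRIMARY KEY, customer TEXT, modified_date TEXT)"
)

def incremental_load(last_watermark: str) -> str:
    """Fetch only rows modified since the watermark and upsert them locally."""
    rows = src.execute(
        "SELECT id, customer, modified_date FROM orders WHERE modified_date > ?",
        (last_watermark,),
    ).fetchall()
    local.executemany(
        "INSERT OR REPLACE INTO orders (id, customer, modified_date) VALUES (?, ?, ?)",
        rows,
    )
    local.commit()
    # The new watermark is the max modified_date seen, or the old one if nothing changed.
    return max((r[2] for r in rows), default=last_watermark)
```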