Server Capacity for a Marketplace

Hi,
I was thinking of using EM for a marketplace; the goal would be to build the backend of a PIM/DAM system plus the Order Management backend and order processing.

Wondering if anyone has experience using EM for this? Curious to hear your thoughts.

I implemented EM as an ETL tool complementing a PIM, which works flawlessly, but I'm more curious about the order and product management level.

We would be talking about multiple vendors (a couple of hundred) using EM to push/pull product data and model data.

In terms of orders, potentially a couple of thousand per day, which EM would push, update and pull every 5 minutes via batch schedules.

So there would probably be two separate EM Server instances, one for the PIM/DAM and the other for the order flow.

Thank you very much.

Hi @JoMar -
Interesting project! I managed all integrations for our home-grown PIM and our cloud-based ERP and ecommerce platforms.

In our case it was an exercise of gathering solid requirements up front and then having the time to execute them, no different than any software/data product implementation.

Our volume may be different from yours though, so I'll clarify:

  • 3,000 SKUs per year (images etc. could change, and I had a sync process that would look for changes and identify them, pushing/pulling from ecommerce to verify and keep things in sync).

  • Order management from ecommerce to ERP. At peak, 7,000 orders per hour, which the ERP side had a hard time with, so I had to break it up in EM to make it work for us (retry, error handling, etc.) at about 500 per hour (see the sketch below).
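
For what it's worth, the throttling idea in plain Python (a sketch only; the real logic was an EM workflow, and send_to_erp() is a hypothetical placeholder for the ERP call):

```python
# Illustrative only: the actual implementation was an EasyMorph workflow, not Python.
# send_to_erp() is a hypothetical placeholder for the ERP call.
import time

MAX_PER_HOUR = 500   # the rate the ERP could comfortably ingest
BATCH_SIZE = 50      # push orders in small chunks
MAX_RETRIES = 3

def forward_orders(orders, send_to_erp):
    """Push orders to the ERP in chunks, respecting the hourly cap and retrying failures."""
    seconds_between_batches = 3600 / (MAX_PER_HOUR / BATCH_SIZE)
    for i in range(0, len(orders), BATCH_SIZE):
        batch = orders[i:i + BATCH_SIZE]
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                send_to_erp(batch)
                break
            except Exception:
                if attempt == MAX_RETRIES:
                    raise                    # surface the error after the last retry
                time.sleep(10 * attempt)     # simple backoff before retrying
        time.sleep(seconds_between_batches)  # throttle to roughly 500 orders/hour
```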

Honestly, I haven't found anything EM hasn't been able to do for us in that space, but we've made a large effort to ensure we do things in a consistent way and document the process so we can repeat it. I think every CIO/president wants to buy a product that can just do this sort of work, but each integration and format is unique.

Hi @adambeltz,
Great feedback. Mostly we want to create a niche marketplace in a specific country; considering the expected customer retention rate vs cost vs performance, this is where we see the best ROI, and also a long-term solution. Being used to dealing with ERPs such as SAP, Salesforce and other ecommerce ERPs, I prefer the no-code/low-code route due to the cost and flexibility.

Our stack will be simple in the beginning: WordPress/Shopify + EM + AWS. Then we scale up according to the product range we add and customer retention vs order volume.

Curious as to what you have used for image storage?

We are thinking of AWS S3 (directly integrated with EM) with CloudFront in front as a CDN.

Similarly, we don't expect more than 1,000–3,000 SKUs in the first years, and we don't expect image changes, as we want a strict onboarding process.

7,000 orders per hour sounds insane; we expect a max of 2,000 orders at peak. The batches are because we will have a temporary database to accept orders (to load-balance pushing them) and then send them to the main DB (we are still evaluating this): one is the temp DB, the other the "mother DB". Roughly, the handover would look like the sketch below.
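
Sketched in plain Python/SQL just to show the idea (table and column names are placeholders; in practice this would be an EM workflow on a 5-minute schedule):

```python
# Placeholder sketch of the 5-minute batch that moves orders from the temp DB
# to the main ("mother") DB. Table and column names are assumptions for illustration.
import sqlite3

def transfer_pending_orders(temp_conn: sqlite3.Connection,
                            main_conn: sqlite3.Connection,
                            batch_size: int = 500) -> int:
    """Pick up pending orders from the staging DB and copy them to the main DB."""
    rows = temp_conn.execute(
        "SELECT order_id, payload FROM staging_orders WHERE status = 'Pending' LIMIT ?",
        (batch_size,),
    ).fetchall()

    for order_id, payload in rows:
        main_conn.execute(
            "INSERT INTO orders (order_id, payload) VALUES (?, ?)",
            (order_id, payload),
        )
        temp_conn.execute(
            "UPDATE staging_orders SET status = 'Transferred' WHERE order_id = ?",
            (order_id,),
        )

    main_conn.commit()
    temp_conn.commit()
    return len(rows)
```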

Very cool! Image storage for us was a wonky Rackspace concept. I had to have a translation table to map the product ID to the image path. Not my best work, but if you control it in AWS, it feels like you'd have a nicer experience.

If I understood correctly, you get the image and build a path based on the product ID for future reference, i.e. the call URL itself? The idea is to build the path when getting the image from the manufacturer/3rd party, create the mapping and the URL in EM itself, and then define that path for the image when uploading it to AWS (full or partial, in case you also want to separate images into different folders). Something along these lines:
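
(Sketched in Python just to illustrate the mapping; in EM this would be the download command plus an S3 export. The bucket name, path convention and CDN domain are placeholders.)

```python
# Sketch only: bucket name, path convention and CloudFront domain are placeholders.
import boto3
import requests

s3 = boto3.client("s3")
BUCKET = "marketplace-images"          # hypothetical bucket
CDN_BASE = "https://cdn.siteXYZ.com"   # hypothetical CloudFront distribution

def ingest_image(product_id: str, source_url: str, index: int = 1) -> str:
    """Download a vendor image and store it under a key derived from the product ID."""
    key = f"products/{product_id}/{product_id}_{index}.jpg"  # path built from the product ID
    img = requests.get(source_url, timeout=30)
    img.raise_for_status()
    s3.put_object(Bucket=BUCKET, Key=key, Body=img.content, ContentType="image/jpeg")
    return f"{CDN_BASE}/{key}"  # the URL we keep in the mapping table
```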

So I created an image reference table on one of our databases so I could track it outside of one tool. Reporting tools like Power BI/Metabase will allow you to insert a URL and have it show as an image, so end users can see which images are assigned.

The concept you are describing matches the approach we used, though.

My main issue was around the naming of images, e.g. SKU.jpg and, for alternates, SKU_2.jpg, so I could keep it all in line. Roughly:
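
(A minimal sketch of that naming scheme; the base URL below is just a placeholder.)

```python
def image_filename(sku: str, alt_index: int = 1) -> str:
    """Primary image: SKU.jpg; alternates: SKU_2.jpg, SKU_3.jpg, and so on."""
    return f"{sku}.jpg" if alt_index == 1 else f"{sku}_{alt_index}.jpg"

# Example rows in the image reference table (placeholder base URL):
#   sku       image_path
#   ABC123    https://images.example.com/ABC123.jpg
#   ABC123    https://images.example.com/ABC123_2.jpg
```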

@JoMar, I'm curious whether you're considering the custom API functionality in EasyMorph Server for the purpose that you described? Although I'm not that familiar with PIMs in general, so my question might be irrelevant.

I see your point. In that case I think EM with AWS S3 would be perfect to do that mapping and management, keeping images in a separate DB just to “serve” them.

Hi @dgudkov,
I was going through this the other day: How to create API endpoints without coding.

And this is definitely of interest for the project, especially to allow the vendors/merchants to push/pull data to/from our DBs.

As an example:

Vendor A sends their image URLs and product data to our DB via the EM custom endpoint; in EM we do the proper image mapping to our new URL, then use a download command to get the images into our S3, using the mapping and path built in EM.

So a custom endpoint "api.siteXYZ.com/{{vendor id}}/Product_data". Each vendor has an ID and credentials to submit data. For example, from the vendor side:
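
(Just to make the idea concrete; the field names and the auth header format are assumptions, not an actual EM API contract.)

```python
# Hypothetical vendor-side call to the custom endpoint described above.
import requests

VENDOR_ID = "vendor-a"
API_KEY = "vendor-a-secret"  # each vendor gets its own credentials

payload = {
    "products": [
        {
            "sku": "ABC123",
            "name": "Example product",
            "price": 19.90,
            "image_urls": ["https://vendor-a.example.com/images/abc123.jpg"],
        }
    ]
}

resp = requests.post(
    f"https://api.siteXYZ.com/{VENDOR_ID}/Product_data",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()  # EM then runs the mapping/download workflow on the payload
```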

Would this be possible?

What also needs more investigation:

  1. The main thing I need to think through is the use cases vs the gateway limits. Some data would have to be GET/pulled via EM (on a schedule, roughly every 5/10 minutes), other data PUSHed/POSTed via the custom endpoints for real-time data (more event-driven). That is why we have the initial temp DB: it receives the orders, and every 5 minutes EM can pick up the ones in that DB with status "Pending".

  2. What is the rate limit of the custom endpoints? How can I guarantee that an influx of orders or product data will not time out? Is it based only on the server capacity and bandwidth, or does EM itself impose a limit in the gateway?

A PIM platform is little more than a sort of DB, but it allows user access policies (which can be managed from the frontend), has a simple frontend to manage data, and also allows business logic for data transformation.

With a custom frontend (e.g. React) + EM for ETL and custom endpoints + a database for product data and image URLs (considering you use AWS S3; even dsets could be used for this), you can actually build quite a nice PIM.

I definitely think EM can do wonders here. Scaling up the server on the fly and using dsets (Considering the billions of rows) really makes the difference.

Thank you.

It's based only on the machine performance and workflow complexity. The gateway has no restriction on the number of requests it can process. In our tests, we processed nearly 1 million (very light) requests per day. We don't position the API service as top-performance in general, but it's quite capable.

Thanks @dgudkov,

I think for now, for real-time data, to ensure performance, there are only 3 use cases:

  1. Priority orders (next-day delivery); orders of this type could go via the gateway (the rest via regular batches)
  2. Delta stock exchanges: updating delta stock is a must to avoid cancellations
  3. Product pricing (flexibility to quickly update a price)

1 million would be more than enough (order data is also made up of very small files). Estimating 2,000 orders per hour, around 10–20% would be next-day delivery, so at most around 400 orders per hour would use the gateway.

Yes, that is possible. Uploading (image) files isn't currently possible with the API service, but if you pass a URL where the image can be downloaded from, then a workflow can download it.

I would only suggest making this process asynchronous: a request to an API endpoint adds a "task" to a "queue" and returns 200 (OK). Then a scheduled Server workflow processes such tasks (i.e. downloads images, puts them into S3, etc.) from the "queue", batch by batch.
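
A rough illustration of that pattern in plain Python (in EasyMorph the enqueue step would be the endpoint's workflow and the processing step a scheduled workflow; table names here are placeholders):

```python
# Sketch of the asynchronous "queue" pattern: the endpoint only enqueues a task
# and answers 200 (OK); a scheduled job drains the queue later, batch by batch.
import json
import sqlite3

def enqueue_task(conn: sqlite3.Connection, task: dict) -> None:
    """Called by the API endpoint workflow: store the task and return immediately."""
    conn.execute(
        "INSERT INTO task_queue (payload, status) VALUES (?, 'queued')",
        (json.dumps(task),),
    )
    conn.commit()

def process_queue(conn: sqlite3.Connection, handle, batch_size: int = 100) -> int:
    """Called on a schedule: process queued tasks (e.g. download images, put them into S3)."""
    rows = conn.execute(
        "SELECT id, payload FROM task_queue WHERE status = 'queued' LIMIT ?",
        (batch_size,),
    ).fetchall()
    for task_id, payload in rows:
        handle(json.loads(payload))  # the actual work, e.g. image download + S3 upload
        conn.execute("UPDATE task_queue SET status = 'done' WHERE id = ?", (task_id,))
    conn.commit()
    return len(rows)
```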

PS. If you would like to have a call and discuss the use of API endpoints for the task, feel free to reach out to us.