Notes by Doctor Droid

Why API integrations break and how to avoid them?



Today, our products are deeply dependent on third-party integrations to run successfully. Common integrations that we have come across in building products include payment gateways, communication APIs, CRMs, client integrations and banking APIs.

Most common reasons for issues with integrations

  1. Poor Provisioning by Service Provider

    While multi-tenant architectures with distributed load are the recommended way to build API products, we often have trouble reaching service providers because they have not provisioned enough capacity. Peak traffic aside, we have seen vendor services go down even during lean periods after a minor increase in request volume.

  2. Latency breaches → Timeouts

    To protect existing workflows and thread pools from choking, we have pre-defined latency thresholds with service providers for each API. When a response takes longer than its threshold, the request times out at our end.

  3. Unhandled error codes

    A new error code, whether generated by an edge case or otherwise, that our exception handling does not cover can disrupt our workflows.

  4. Incorrect response

    Changes in a third party’s APIs can lead to unexpected response formats and bodies, causing downstream APIs to reject the response, throw errors or behave unexpectedly.

  5. Webhook Deactivations

    Modern integrations depend on webhooks to receive data from third parties. We have seen instances where webhooks were deactivated at the service provider’s end without appropriate notification.
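Reasons 3 and 4 above can be partly guarded against by classifying every vendor response explicitly instead of assuming it matches the documentation. The following is a minimal sketch, not a definitive implementation: the error codes, the field names and the `classify_response` helper are all hypothetical and stand in for whatever your provider actually returns.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Hypothetical error codes and fields for an imaginary payment vendor.
RETRYABLE_CODES = {"RATE_LIMITED", "TEMPORARILY_UNAVAILABLE"}
TERMINAL_CODES = {"CARD_DECLINED", "INVALID_ACCOUNT"}
REQUIRED_FIELDS = {"transaction_id": str, "status": str, "amount": int}


@dataclass
class Outcome:
    kind: str    # "ok", "retry", "failed", "unknown_error" or "malformed"
    detail: str  # error code, offending field name or transaction id


def classify_response(status_code: int, body: Mapping[str, Any]) -> Outcome:
    """Map a vendor response to an explicit outcome the workflow can act on."""
    if status_code >= 400:
        raw = body.get("error_code")
        code = raw if isinstance(raw, str) else repr(raw)
        if code in RETRYABLE_CODES:
            return Outcome("retry", code)
        if code in TERMINAL_CODES:
            return Outcome("failed", code)
        # Unrecognised code: do not guess, surface it so it can be alerted on.
        return Outcome("unknown_error", code)
    # Success status: check that the body still has the shape we rely on.
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(body.get(name), expected_type):
            return Outcome("malformed", name)
    return Outcome("ok", body["transaction_id"])
```

An `unknown_error` or `malformed` outcome is the signal worth alerting on, since it usually means the provider changed something without telling you.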

Remediation Strategies

  1. Monitoring API latencies, throughput and error rates

  2. Multiple integrations in case of critical services - As described above, a service provider’s APIs can falter for multiple reasons. To reduce dependency on any single provider, set up multiple integrations for critical services

  3. Setting up fallback options in case of failures - Set up circuit breakers with a default fallback option in case of failure

  4. Dropping requests that consistently exceed latency thresholds - When requests keep breaching the threshold, drop them rather than letting them queue up

  5. Enable webhook monitoring and set up alerts for a sudden dip in transaction frequency

  6. Store API response bodies in logs or databases so that you can retrieve them and identify data issues when debugging
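Strategies 3 and 4 are often combined in a circuit breaker: after repeated failures, calls to the provider are skipped for a cool-down period and a fallback is used instead, so slow requests do not pile up. Below is a minimal sketch of that idea, assuming a single-threaded caller. The `CircuitBreaker` class, its parameters and its defaults are illustrative names, not part of any particular library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Skip the primary call after repeated failures and use a fallback."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock  # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at: Optional[float] = None

    def is_open(self) -> bool:
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.reset_after_s:
            # Cool-down elapsed: allow a trial call; one more failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
            return False
        return True

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.is_open():
            return fallback()  # shed the request instead of queuing it
        try:
            result = primary()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

In practice `primary` would wrap the vendor call with a timeout set to the agreed latency threshold, and `fallback` could call a secondary provider or return a default response.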

Introducing Outsiders: Monitor third-party integrations

Outsiders helps you monitor third-party integrations, covering both performance and context monitoring.

Context Monitoring

  • Change in response body format or variables

  • Data variation / sudden change for a specific parameter

  • A mismatch between the status code and API response body

  • Mapping between API call and webhook response
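To make the first and third checks above concrete, here is a rough sketch of how a change in response body format and a status/body mismatch could be detected. It is not how Outsiders is implemented. The helper names and the convention that an error payload carries an `error` key are assumptions made for illustration.

```python
from typing import Any, Dict, List, Mapping


def shape_of(body: Any, prefix: str = "") -> Dict[str, str]:
    """Flatten a JSON-like value into {dotted.path: type name}."""
    if isinstance(body, Mapping):
        shape: Dict[str, str] = {}
        for key, value in body.items():
            shape.update(shape_of(value, f"{prefix}{key}."))
        return shape
    return {prefix.rstrip("."): type(body).__name__}


def diff_shapes(baseline: Mapping[str, str],
                current: Mapping[str, str]) -> List[str]:
    """List fields that were added, removed or changed type."""
    changes = []
    for path in sorted(set(baseline) | set(current)):
        old, new = baseline.get(path), current.get(path)
        if old is None:
            changes.append(f"added {path} ({new})")
        elif new is None:
            changes.append(f"removed {path} ({old})")
        elif old != new:
            changes.append(f"retyped {path} ({old} -> {new})")
    return changes


def status_body_mismatch(status_code: int, body: Mapping[str, Any]) -> bool:
    """True when a 2xx response carries an error payload (assumed 'error' key)."""
    return 200 <= status_code < 300 and "error" in body
```

For example, comparing a stored baseline response against a fresh one:

```python
baseline = shape_of({"id": "a1", "amount": 100, "payer": {"name": "x"}})
current = shape_of({"id": "a1", "amount": "100.00", "payer": {"full_name": "x"}})
print(diff_shapes(baseline, current))
```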

Performance Monitoring

  • Error rates and error codes

  • API latency

  • Ack to webhook delay
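As an illustration of the last metric, ack-to-webhook delay can be measured by recording when the API call was acknowledged and when the matching webhook arrived. The sketch below assumes each webhook carries the same request identifier as the original call. The `WebhookTracker` class and its 60-second default are hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional


class WebhookTracker:
    """Track the delay between an API acknowledgement and its webhook."""

    def __init__(self, max_delay_s: float = 60.0) -> None:
        self.max_delay_s = max_delay_s
        self.acked_at: Dict[str, float] = {}  # request id -> ack timestamp
        self.delays: List[float] = []

    def record_ack(self, request_id: str, at: float) -> None:
        self.acked_at[request_id] = at

    def record_webhook(self, request_id: str, at: float) -> Optional[float]:
        acked = self.acked_at.pop(request_id, None)
        if acked is None:
            return None  # webhook with no matching API call
        delay = at - acked
        self.delays.append(delay)
        return delay

    def overdue(self, now: float) -> List[str]:
        """Request ids still waiting for a webhook beyond the allowed delay."""
        return sorted(rid for rid, t in self.acked_at.items()
                      if now - t > self.max_delay_s)

    def median_delay(self) -> Optional[float]:
        return median(self.delays) if self.delays else None
```

A growing `overdue` list is also a practical early sign of the webhook deactivations described earlier, since acknowledged calls stop receiving their callbacks.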

Request access to our product in beta here!