The Power of Curation: Apigee API hub's Data Ingestion for Enriched and Unified Enterprise API Data

In today's fast-paced digital world, organizations are increasingly challenged by a proliferating number of APIs. Managing a large and diverse API catalog across different teams, cloud providers, environments, and sometimes gateways can become incredibly complex, leading to inefficiencies such as duplication instead of reuse.

Apigee API hub steps in as a "single pane of glass" to unify APIs and API data and to help manage, observe, and govern all APIs within an organization, regardless of their source.

The Data Ingestion Process is crucial for achieving a unified view. It establishes a data pipeline responsible for extracting, parsing, fingerprinting, and ingesting API data from various sources into the Apigee API hub. It's imperative that your API hub is comprehensive, encompassing all Enterprise APIs and API data regardless of their origin.

figure1.png

 Fig. 1: API hub Data Ingestion Framework

The API hub data ingestion framework is based on two components:

  • Plugin: Plugins allow Apigee API hub to connect and ingest APIs and API data from external sources (API Management Platform, Gateway, Git Repositories, Documentation tools …) where your APIs and API data are managed or defined. Currently, API hub provides a dedicated plugin for Apigee X and Apigee Hybrid.
  • Curation: Curation is the process of transforming and enriching the API data ingested by plugins. This process ensures that API information from different sources becomes consistent and can be effectively used for governance, discovery, and management within Apigee API hub.

Apigee API hub Curation

As mentioned, Curation is the process of transforming and enriching the API data ingested by Apigee API hub plugins.

The need for curation arises from several challenges faced in managing a vast API portfolio:

  • Incomplete Gateway Data: Often, critical information like Business Unit, API owner, or Lifecycle stage is not readily available in API gateway data, requiring enrichment from other systems.
  • Duplicate APIs: Different API design or implementation processes (SDLC) can lead to duplicate APIs from different versions or deployments (e.g., an API existing in Dev, Stage, QA, and Prod environments).
  • Disparate Deployments: Each Apigee deployment might be different, leading to duplicates if not properly managed.
  • Data Integrity and Correlation: Ingesting APIs requires an understanding of the API hub data model and APIs to ensure data integrity and to correlate related APIs across different tools and solutions (e.g., an API managed in Apigee, scanned by an external enterprise compliance tool, and with an OpenAPI specification file in GitHub).

API hub offers two types of curation:

  1. Default Curation: This is the basic level of curation, automatically selected during plugin instance creation. Its main function is to identify and merge duplicate APIs primarily based on their display name. For example, if an "Orders" API exists in both production (project-alpha) and development (project-beta) Apigee X projects, the default curation will consolidate them into a single "Orders" API entity in API hub with two associated deployments.
  2. Custom Curation: When the default logic doesn't meet an organization's specific needs for identifying and transforming API data, custom curation can be implemented. This allows defining custom logic for:
    • Identifying duplicate APIs using more sophisticated rules beyond just the display name.
    • Custom data transformation and enrichment, such as adding more data based on the source or contextual information, or mapping data fields to internal standards.
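
The default behavior described above can be pictured with a minimal Python sketch. It is not API hub's implementation: the record shapes (IngestedProxy, ApiEntity) and the "Payments" entry are hypothetical, and the only rule applied is the one stated above, grouping by display name.

```python
from dataclasses import dataclass, field


@dataclass
class IngestedProxy:
    """One API record as a plugin might report it (hypothetical shape)."""
    display_name: str
    project: str
    environment: str


@dataclass
class ApiEntity:
    """A consolidated API and the deployments merged into it."""
    display_name: str
    deployments: list = field(default_factory=list)


def merge_by_display_name(proxies):
    """Group ingested records that share a display name into one entity."""
    merged = {}
    for proxy in proxies:
        if proxy.display_name not in merged:
            merged[proxy.display_name] = ApiEntity(proxy.display_name)
        merged[proxy.display_name].deployments.append(
            (proxy.project, proxy.environment)
        )
    return list(merged.values())


ingested = [
    IngestedProxy("Orders", "project-alpha", "prod"),
    IngestedProxy("Orders", "project-beta", "dev"),
    IngestedProxy("Payments", "project-alpha", "prod"),
]
apis = merge_by_display_name(ingested)
for api in apis:
    print(api.display_name, api.deployments)
```

Here the two "Orders" records collapse into a single entity carrying two deployments, which mirrors the project-alpha / project-beta example. Custom curation replaces the grouping key with whatever rule fits your organization.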

API Fingerprinting is a crucial API hub Curation functionality designed to identify APIs across different versions and environments. This feature assigns a unique identifier, or "fingerprint," to each API. This allows API hub to differentiate between APIs, even when they are present across various platforms, providers, gateways, and tools, or when multiple versions of the same API exist.
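
To make the idea of a fingerprint more concrete, here is a small Python sketch that derives a stable identifier from normalized attributes. The choice of inputs (name and base path), the normalization rules, and the hashing are all assumptions made for illustration; API hub's own fingerprinting logic may work differently.

```python
import hashlib


def fingerprint(name: str, base_path: str) -> str:
    """Return a stable identifier built from normalized API attributes.

    Assumption for this sketch: two records describe the same API when
    their case-insensitive name and their base path (ignoring a trailing
    slash) are equal.
    """
    normalized = "|".join([
        name.strip().lower(),
        base_path.strip().rstrip("/"),
    ])
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


prod = fingerprint("Orders", "/orders/")
dev = fingerprint(" orders ", "/orders")
other = fingerprint("Payments", "/payments")

print(prod == dev)    # same API seen in two environments
print(prod == other)  # a different API
```

Because the identifier depends only on the normalized inputs, the same API reported by different sources yields the same value, while unrelated APIs do not.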

A key enabler for custom curation is API hub's seamless integration with Google Application Integration.

Application Integration, a fully managed service, allows users to build integration flows that orchestrate and enrich API data. It provides a no-code/low-code interface that democratizes development, letting non-technical users and developers rapidly build integrations with visual tools and connectors. This accelerates development cycles, reduces costs, and boosts business agility by empowering more people to create solutions quickly.

A Practical Example: Curation Sample Using a GitHub Repository

Sample Curation Use Case

To illustrate custom API hub Curation, let's explore a concrete example found in the g-lalevee/apihub-curation-github GitHub repository. This repository provides a sample custom API hub Curation process built using Google Application Integration, specifically designed to ingest API data and specification files from a GitHub repository.

The curation logic in this sample performs the following steps for each ingested API:

1. Checks for API Specification File:
It verifies whether an API specification file is available in the GitHub repository.

  • If the file is not available, the API is ingested without additional enrichment, only applying the renaming part of the process.
  • If the OAS file is available, it extracts:
    • API Version data from the API specification file (OpenAPI extensions). This data is used to initialize the API hub API Version System attributes.
    • API data from an API configuration file. This data is used to initialize the API hub API System attributes.
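
As a rough illustration of the extraction step, the following Python sketch reads extension fields from the info object of an OpenAPI document and maps them to attribute names. The extension keys (x-lifecycle, x-compliance) and the attribute names are hypothetical; the actual sample runs in Application Integration and may use different keys and mappings.

```python
import json

# Hypothetical extension names; the sample repository may use other keys.
EXTENSION_TO_ATTRIBUTE = {
    "x-lifecycle": "lifecycle",
    "x-compliance": "compliance",
}


def version_attributes_from_spec(spec_text):
    """Map selected OpenAPI `info` extensions to version attribute values."""
    if spec_text is None:
        return {}  # no specification file: nothing to enrich
    spec = json.loads(spec_text)
    info = spec.get("info", {})
    attributes = {"version": info.get("version")}
    for extension, attribute in EXTENSION_TO_ATTRIBUTE.items():
        if extension in info:
            attributes[attribute] = info[extension]
    return attributes


sample_spec = json.dumps({
    "openapi": "3.0.3",
    "info": {"title": "Orders", "version": "1.0.0", "x-lifecycle": "production"},
    "paths": {},
})
attributes = version_attributes_from_spec(sample_spec)
print(attributes)
```

When no specification file is found, the function returns an empty mapping, which corresponds to the "ingested without additional enrichment" branch above.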

figure2.png

Fig. 2: Init API hub Attributes from GitHub files

2. API Renaming:
The API name is standardized by removing any versioning information to consolidate all versions under a single, consistent name. For instance, proxyName.v1 and proxyName.v2 would become a single “proxyName” API entity in API hub, with v1 and v2 as separate versions.
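
The renaming rule can be sketched in a few lines of Python. The regular expression below is an assumption: it treats a trailing ".vN", "-vN", or "_vN" as the version suffix, which covers the proxyName.v1 example but may not match every naming convention in your organization.

```python
import re

# Assumed convention: the version is a trailing "v<digits>" preceded by
# ".", "_" or "-".
VERSION_SUFFIX = re.compile(r"^(?P<name>.+?)[._-]v(?P<version>\d+)$", re.IGNORECASE)


def split_name_and_version(proxy_name):
    """Return (api_name, version); version is None if no suffix is found."""
    match = VERSION_SUFFIX.match(proxy_name)
    if match is None:
        return proxy_name, None
    return match.group("name"), "v" + match.group("version")


for proxy in ["proxyName.v1", "proxyName.v2", "cl-AudienceAnalyse-v1", "Orders"]:
    print(proxy, "->", split_name_and_version(proxy))
```

Proxies that share the same stripped name end up under one API entity, each keeping its own version label.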

figure3.png

Fig. 3: Fingerprint Renaming

Note: This sample curation uses a single GitHub repository, defined in the Application Integration Config Variables, to store all API specification and configuration files. An alternative "one repository per API" approach could dynamically select a separate repository for each API based on its name.

Setting up the GitHub Curation Sample

To implement this sample, you need a Google Cloud Platform (GCP) account with both Apigee API hub and Application Integration activated, along with a GitHub account.
The components involved in this sample curation process are shown below.

figure4.png

Fig. 4: API hub curation sample overview

The setup process involves several steps:

  1. GitHub Setup:
    You'll need a GitHub repository to store your configuration and API specification files.
  2. Google Integration Connection Setup:
    Application Integration connects to your GitHub repository via a GitHub Integration Connection. This connection needs to be configured using the credentials from a GitHub App.
  3. Google Application Integration Setup:
    Create and deploy a new integration from the apiHubCurationGithub-v1.json file.
  4. Google Apigee API hub Setup:
    In the API hub interface, from Settings, create a curation linked to the previously deployed integration.

Details of both the creation and testing processes are available in the GitHub repository.

Sample Curation in Action

Once the setup is complete, you can run the curation manually from the API hub settings menu.

In the following video, you will see:

  • Initial configuration
    • API hub setup: Apigee X and hybrid plugin, apiHubCurationGithub-v1 curation
    • Application Integration curation process apiHubCurationGithub-v1
    • Apigee and API hub initial content: 2 Apigee proxies (cl-AudienceAnalyse-v1, cl-AudienceAnalyse-v2) and no cl-AudienceAnalyse API in API hub.
  • Run
    • Run the curation manually from the API hub plugin menu
    • Application Integration execution logs
    • The API hub now contains one API with two versions. Both the API and VERSION attributes have been initialized, and OpenAPI specifications are included.

Fig. 5: API hub curation process in action

Unlock Your APIs' Potential: Curate for Clarity and Efficiency

One of the biggest challenges in a diverse API ecosystem is correlating API data across different sources and uniquely identifying an API regardless of where its information resides. Custom API hub Curation, powered by Google Application Integration, empowers you to:

  • Establish robust data mapping: Define clear relationships between data points from various sources, ensuring that information about a single API is linked and consistent.
  • Implement unique identifiers: Assign a consistent, singular identifier to each API that persists across Confluence, GitHub, and any other integrated system. This eliminates confusion and ensures that everyone is always referring to the same API.
  • Automate data synchronization: Keep your curated API portfolio up-to-date automatically as changes occur in your source systems, reducing manual effort and potential errors.

By centralizing and intelligently connecting your API data, you create a single source of truth for your entire organization. This not only makes your APIs easier to find and understand but also fosters a culture of reuse, reduces redundant work, and accelerates your development cycles.

Apigee API hub’s enriched API data becomes even more powerful when considering the current surge in AI agents and agentic platforms, as these advanced tools critically depend on reliable, accessible connections to your enterprise systems and data to truly deliver on their promise of boosted efficiency.

But a common hurdle is getting these agents to work seamlessly with varied API data. Apigee API hub's Curation bridges this gap. It transforms your raw API catalog by cleaning, standardizing, and enriching it, making the data instantly accessible and more valuable.

This directly supercharges the effectiveness of every agent your teams develop.

Now, it's time to simplify your API landscape.

Ready to begin, as easy as 1-2-3?

  1. Enable API hub: you will automatically see proxies from your Apigee X or Apigee hybrid organization.
  2. Set up this sample custom curation if you store API specifications in a GitHub repository.
  3. Voilà! You have a single place for your enterprise truth to discover, manage, and govern APIs, and to powerfully fuel your agents!
