GenAI and API Management: Part 2b - A Technical Guide on Monetizing LLMs with Apigee


Continuing from Part 2a - Need to Monetize LLM APIs

By leveraging Apigee, enterprises can package LLM access into API products and monetize their usage effectively.

In this blog post, we discuss how to set up monetization in Apigee.

A Rate Plan is associated with an API Product; an API Product that has a Rate Plan associated with it is considered monetized.

To gain access to the desired APIs, consumers (App Developers) buy a subscription to a Rate Plan and then create an App using the associated API Product(s).

[Image: App Developers subscribe to a Rate Plan and create an App with the associated API Product(s)]

Below are the key steps to consider when monetizing LLM usage with Apigee.

1. Build LLM-powered APIs with Apigee

  • Proxy Creation: Start by creating an API Proxy in Apigee with the required API management capabilities or policies (such as Verify API Key, CORS, etc.) to be executed before forwarding a request to the chosen LLM (Vertex AI, Anthropic, HuggingFace, etc.). The API Proxy should be attached to the desired API Product.
  • Target Endpoint Configuration: Configure the Target Endpoint for each API Proxy to point to the respective LLM provider's API endpoint.
  • Implement Token-Based Monetization in Apigee by following the steps below; a minimal sketch of steps 1 and 2 appears below.
    1. Prompt Extraction: Apigee extracts the client’s prompt from the incoming API request payload.
    2. Token Counting: The number of tokens in the prompt is determined. This can be achieved by leveraging the LLM's built-in token counter (if available) or by implementing a custom JavaScript policy within Apigee to estimate the token count. See details about Apigee's JavaScript Policy here.
    3. Cost Calculation: Based on the token count and the LLM's pricing model (e.g., $2 per 1 million tokens for Anthropic), the cost of the request is calculated.
    4. Forwarding and Response: The prompt is then forwarded to the LLM's API. Apigee receives the LLM's response and relays it back to the user.

This will allow for precise tracking and billing based on actual LLM service usage, ensuring fair and transparent monetization of LLM-powered applications.
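
To make steps 1 and 2 concrete, below is a minimal sketch of a prompt-extraction and token-estimation policy pair. The JSONPath assumes a Vertex AI Gemini-style request body, the four-characters-per-token heuristic is only a rough approximation, and the policy names and llm.* variables are illustrative assumptions rather than a standard configuration.

<!-- Step 1: extract the prompt text from the incoming JSON request -->
<ExtractVariables name="EV-ExtractPrompt" continueOnError="false" enabled="true">
  <Source>request</Source>
  <VariablePrefix>llm</VariablePrefix>
  <JSONPayload>
    <Variable name="prompt">
      <JSONPath>$.contents[0].parts[0].text</JSONPath>
    </Variable>
  </JSONPayload>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</ExtractVariables>

<!-- Step 2: estimate the token count with an inline JavaScript policy -->
<Javascript name="JS-EstimateTokens" timeLimit="200" enabled="true">
  <Source>
    var prompt = context.getVariable('llm.prompt') || '';
    // Rough heuristic: ~4 characters per token. Swap in the provider's
    // tokenizer or reported usage metadata for billing-grade accuracy.
    var estimatedTokens = Math.ceil(prompt.length / 4);
    context.setVariable('llm.estimatedTokens', estimatedTokens.toString());
  </Source>
</Javascript>

The llm.estimatedTokens variable can then drive the cost calculation in step 3 and the Quota and Data Capture policies described in the next section.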

2. Create LLM Monetization Policies in Apigee

The following policies are built into the API Proxy to implement monetization.

Monetization Limit Check Policy - This policy checks that a subscription exists for the client making the API call and that they have not exceeded their balance. It should be added after the Verify API Key policy. More details on this policy can be found in the official documentation. If a developer has not purchased a subscription to the associated API Product, access to the monetized API is blocked, and a 403 status is returned with a custom message. The diagram below shows a simplified representation of the Monetization Limit Check policy in Apigee.

[Image: simplified representation of the Monetization Limit Check policy flow]

For a view of the debug, please see Figure 1 in the appendix.
Below is a code sample of this policy:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<MonetizationLimitsCheck continueOnError="false" enabled="true" name="MonetizationLimitsCheck-1">
 <DisplayName>Monetization-Limits-Check-1</DisplayName>
 <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
 <FaultResponse>
   <Set>
     <Payload contentType="text/xml">
       <error>
         <messages>
           <message>Usage has been exceeded ({mint.limitscheck.isRequestBlocked}) or app developer has been suspended</message>
         </messages>
       </error>
     </Payload>
     <StatusCode>403</StatusCode>
   </Set>
 </FaultResponse>
</MonetizationLimitsCheck>

Monetization Quota Policies: Two Quota policies are required for monetization, as explained below.

Quota Policy with Enforce: Utilize Apigee's Quota policy to enforce usage limits based on token consumption. This policy allows one to set limits per API Key, Developer, or App over specific time intervals (e.g., 10,000 tokens per day).

Quota Policy with Count: This will decrement the available token count as API calls come through. One could also attach both policies in a Shared Flow and invoke that Shared Flow from each API Proxy that should be monetized. More details on this policy can be found in the official documentation.

The diagram below shows a simplified representation of the (Request Path) Quota policy in Apigee. For a view of the debug, please see Figure 2 in the appendix.

[Image: simplified representation of the Quota policy on the request path]

The diagram below shows a simplified representation of the (Response Path) Quota policy in Apigee. For a view of the debug, please see Figure 3 in the appendix.

[Image: simplified representation of the Quota policy on the response path]

Below is a code sample showing how the two Quota policies are attached in the ProxyEndpoint flow (the policy definitions themselves are sketched after this block):

<ProxyEndpoint name="default">
   <PreFlow name="PreFlow">
       <Request>
           <Step>
               <Name>Enforce-Only</Name>  <!--First quota policy enforces quota count -->
           </Step>
       </Request>
       <Response>
           <Step>
               <Name>Count-Only</Name>   <!-- Second quota policy counts quota if call is successful -->
               <Condition>response.status.code = 200</Condition>
           </Step>
       </Response>
   </PreFlow>
   <Flows/>
   <PostFlow name="PostFlow">
       <Request/>
       <Response/>
   </PostFlow>
   <HTTPProxyConnection>
       <BasePath>/quota-shared-name</BasePath>
   </HTTPProxyConnection>
   <RouteRule name="noroute"/>
</ProxyEndpoint>
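
For reference, the two Quota policy definitions attached above might look like the following minimal sketch; the shared name, the limits, and the llm.estimatedTokens weight variable (set earlier in the proxy) are illustrative assumptions.

<!-- Request path: checks the shared quota without incrementing it -->
<Quota name="Enforce-Only" continueOnError="false" enabled="true">
  <Allow count="10000"/>
  <Interval>1</Interval>
  <TimeUnit>day</TimeUnit>
  <EnforceOnly>true</EnforceOnly>
  <SharedName>quota-shared-name</SharedName>
</Quota>

<!-- Response path: increments the shared quota, weighted by tokens consumed -->
<Quota name="Count-Only" continueOnError="false" enabled="true">
  <Allow count="10000"/>
  <Interval>1</Interval>
  <TimeUnit>day</TimeUnit>
  <CountOnly>true</CountOnly>
  <SharedName>quota-shared-name</SharedName>
  <MessageWeight ref="llm.estimatedTokens"/>
</Quota>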

Data Capture Policy: This policy captures the monetization consumption data for transactions that are monetized. It captures the currency (currently only USD), rate, rate multiplier (e.g., the number of tokens in the transaction), and revenue share, if any.

The diagram below shows a simplified representation of the data capture policy in Apigee. For a view of the debug, please see Figure 4 in the appendix.

[Image: simplified representation of the Data Capture policy flow]

Below is a code sample of the Data Capture policy:

<DataCapture name="DC-monetization" continueOnError="false" enabled="true">
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
  <Capture>
    <Collect ref="monetization-currency" default="USD"/>
    <DataCollector scope="monetization">currency</DataCollector>
  </Capture>
  <Capture>
    <Collect ref="monetization-success" default="false"/>
    <DataCollector scope="monetization">transactionSuccess</DataCollector>
  </Capture>
  <Capture>
    <Collect ref="monetization-multiplier" default="1"/>
    <DataCollector scope="monetization">perUnitPriceMultiplier</DataCollector>
  </Capture>
  <Capture>
    <Collect ref="monetization-revenue-share" default="0"/>
    <DataCollector scope="monetization">revShareGrossPrice</DataCollector>
  </Capture>
</DataCapture>
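
The Collect elements above read flow variables such as monetization-multiplier, which must be populated earlier in the flow. Below is a minimal sketch that does so with an AssignMessage policy, assuming llm.estimatedTokens holds the token count; the variable and policy names are illustrative.

<AssignMessage name="AM-SetMonetizationVariables" continueOnError="false" enabled="true">
  <!-- Copy the token count into the variable the Data Capture policy reads;
       the literal Value acts as a fallback if the Ref cannot be resolved -->
  <AssignVariable>
    <Name>monetization-multiplier</Name>
    <Ref>llm.estimatedTokens</Ref>
    <Value>1</Value>
  </AssignVariable>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</AssignMessage>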

3. Define Rate Plan and API Product in Apigee

  • Rate Plan Setup: Rate Plans that align with the monetization strategy can be created in Apigee. One can define different tiers in the Rate Plan with varying token allowances, overage charges, and pricing models (e.g., pay-as-you-go, tiered pricing, or subscription-based). Check out the official documentation for details on managing Rate Plans for API Products.

    Considering that Large Language Models (LLMs) charge based on the number of tokens processed in a request, Apigee can be configured to accurately track this consumption in real time. The token count reported by the LLM is then fed into the Rate Plan in Apigee. By associating the dynamic token usage with a specific Rate Plan, Apigee gives API providers fine-grained control over monetization, allowing them to charge different rates based on usage tiers, subscription levels, or even the specific LLM model employed.

  • API Product Bundling: Package the LLM API Proxies into the desired API Product(s) with an associated Rate Plan. This allows one to offer different combinations of LLMs and token allowances to cater to diverse customer needs. Creating multiple such API Products as product tiers within Apigee allows for differentiated AI products with different token quotas, as shown in the diagram below.

    [Image: API Product tiers with different token quotas]

  • API Key: API Keys can be issued to Developers to track and control access to the LLM APIs. API Keys can also be stored securely within Apigee using an encrypted Key Value Map (see the sketch below).
  • Developer Portal Integration: API Products can be published on the Integrated Developer Portal or the Drupal Developer Portal. Clear documentation about the APIs, pricing details, and code samples to facilitate easy adoption can also be published on the portals. Check out this reference documentation for details on integrating the monetization module with the Drupal Developer Portal. The screenshot below shows a Drupal Developer Portal serving Apigee API Products.

[Image: Drupal Developer Portal serving Apigee API Products]
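
As noted in the API Key bullet above, keys stored in an encrypted Key Value Map can be read at runtime with the KeyValueMapOperations policy. Below is a minimal sketch; the map name llm-provider-keys and the key name backend-api-key are illustrative assumptions.

<KeyValueMapOperations name="KVM-GetStoredKey" mapIdentifier="llm-provider-keys" continueOnError="false" enabled="true">
  <Scope>environment</Scope>
  <!-- Values from an encrypted KVM must be assigned to variables prefixed with "private." -->
  <Get assignTo="private.llm.apikey">
    <Key>
      <Parameter>backend-api-key</Parameter>
    </Key>
  </Get>
</KeyValueMapOperations>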

4. Metering

Metering, in the context of LLM APIs, is like having a precise measuring instrument for AI. It allows API providers to track and quantify exactly how their valuable language models are being used, providing crucial data for understanding consumption patterns and optimizing their monetization strategy.

Think of it as a detailed logbook that records every interaction with the LLM APIs, capturing essential information such as:

  • Request Counts: How often are users accessing the LLMs? This can be broken down by specific endpoints or operations to identify popular features.
  • Payload Size: Are users sending small prompts or large chunks of text? This reveals the volume of data processed by the LLMs.
  • Processing Time: How long does it take for the LLMs to generate responses? This helps monitor performance and identify potential bottlenecks.
  • User Identification: Who is accessing the LLMs and how often? This allows for granular tracking and differentiated pricing based on user behavior.

This granular data empowers API providers to make informed decisions about their LLM APIs, such as:

  • Implementing fair and transparent pricing: Charge users accurately based on their actual consumption of the LLM resources.
  • Creating tiered pricing plans: Offer different levels of access and features based on usage limits or LLM capabilities.
  • Identifying power users: Recognize the most valuable customers and offer them customized plans or volume discounts.
  • Optimizing LLM performance: Identify usage patterns that impact performance and adjust the infrastructure accordingly.

With Apigee's robust metering capabilities, a user can gain deep insights into their LLM usage, enabling them to effectively monetize their AI innovation and drive business growth.

5. Operational and Security Considerations for LLM Monetization

  • Tokenization Strategy: To ensure accurate token counting, Apigee should use the same tokenization method as the LLM provider. However, different LLMs handle token counting differently. For instance, while Google's Vertex AI models conveniently include the token count in their responses, other providers, such as HuggingFace, may not consistently provide this data. This can be addressed by using a custom JavaScript policy within Apigee to dynamically calculate the token count for these LLMs, ensuring accurate billing regardless of the provider (see the sketch after this list).
  • Rate Limiting: Implement rate limiting policies to prevent abuse and ensure fair usage across all the consumers.
  • Security: Enforce robust security measures, including API Key validation, authentication, and authorization, to protect the APIs and user data.
  • Monitoring and Analytics: Utilize Apigee's monitoring and analytics capabilities to track API usage, identify trends, and optimize the monetization strategy.
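
For providers that do report usage, the token count can be read directly from the response rather than estimated. Below is a minimal sketch, attached to the response path, that extracts the total token count from a Vertex AI Gemini-style response; the usageMetadata path and the llm.totalTokens variable name are assumptions based on that response shape.

<ExtractVariables name="EV-ExtractTokenCount" continueOnError="false" enabled="true">
  <!-- Read the token usage that the model reports in its response body -->
  <Source>response</Source>
  <VariablePrefix>llm</VariablePrefix>
  <JSONPayload>
    <Variable name="totalTokens">
      <JSONPath>$.usageMetadata.totalTokenCount</JSONPath>
    </Variable>
  </JSONPayload>
  <IgnoreUnresolvedVariables>true</IgnoreUnresolvedVariables>
</ExtractVariables>

The extracted llm.totalTokens value can then replace the heuristic estimate when feeding the Quota policy's MessageWeight and the Data Capture policy's perUnitPriceMultiplier.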

6. Enabling Comprehensive Analytics and Logging

Apigee provides detailed insights into API usage, performance, and error rates, allowing one to track LLM interactions, identify trends, and troubleshoot issues effectively. This includes tracking token consumption, latency, and error rates for each LLM. See this page for a comprehensive guide to using the DataCapture policy to collect custom data from API proxies. This article describes best practices for using data collectors and the Data Capture policy.

Benefits of Monetizing LLMs with Apigee

Granular Control and Flexibility: Apigee allows one to precisely control access to LLMs and tailor pricing to different consumer needs. One can define rate limits, quotas, and pricing tiers based on usage, features, or even specific LLM models. This level of granularity enables one to create a variety of offerings and cater to diverse customer segments.

Simplified Billing and Revenue Generation: With Apigee's integrated monetization features, one can automate billing processes and easily track revenue streams. This simplifies operations and reduces administrative overhead associated with managing subscriptions and payments.

Scalability and Reliability: Apigee's robust infrastructure ensures the LLM APIs can handle growing demand while maintaining high availability. This allows one to scale services seamlessly and provide a consistent experience for consumers.

Enhanced Security: Apigee provides comprehensive security features to protect the LLMs and associated data. One can leverage authentication, authorization, and threat protection mechanisms to secure their API endpoints and prevent unauthorized access.

Streamlined Developer Experience: By packaging LLMs into user-friendly API Products, one can empower developers with easy access and integration capabilities. This accelerates adoption and encourages innovation within the developer community.

To see how to monetize LLMs with Apigee in action, watch this short video clip.

Conclusion

Monetizing LLMs through Apigee provides a powerful and efficient way to unlock the value of AI investments. By combining the capabilities of LLMs with Apigee's API Management Platform, businesses can create new revenue streams, optimize costs, and deliver innovative solutions to the market. The token-based billing approach, coupled with Apigee's flexible Rate Plans and robust infrastructure, enables granular control, scalability, and a seamless developer experience. As LLMs continue to evolve, Apigee empowers businesses to stay ahead of the curve and capitalize on the transformative potential of AI.

Enhanced Security: Centralized credential management and prompt sanitization ensure robust security and compliance for both end users and developers.

Effective API Management: Features like rate limiting, quotas, and analytics optimize resource usage and enable monetization strategies, allowing one to generate revenue from LLM APIs.

Appendix

Figure 1

Debug session showing the Monetization Limit Check policy with variables mint.limitscheck.is_subscription_found and mint.limitscheck.status_message

[Screenshot: debug session for the Monetization Limit Check policy]

Figure 2

Debug session showing the (Request Path) Quota policy with variables ratelimit.QU-MonetizationEnforcer.allowed.count and ratelimit.QU-MonetizationEnforcer.available.count

[Screenshot: debug session for the request-path Quota policy]

Figure 3

Debug session showing the (Response Path) Quota policy with variables ratelimit.QU-MonetizationCount.exceed.count and ratelimit.QU-MonetizationCount.used.count

[Screenshot: debug session for the response-path Quota policy]

Figure 4

Debug session showing the Data Capture policy with variables mint.mintng_currency and mint.mintng_tx_success

[Screenshot: debug session for the Data Capture policy]
