Custom Callbacks
For proxy callbacks, see the LiteLLM Proxy docs.
Callback Class
You can create a custom callback class to precisely log events as they occur in litellm.
```python
import litellm
from litellm.integrations.custom_logger import CustomLogger
from litellm import completion, acompletion

class MyCustomHandler(CustomLogger):
    def log_pre_api_call(self, model, messages, kwargs):
        print("Pre-API Call")

    def log_post_api_call(self, kwargs, response_obj, start_time, end_time):
        print("Post-API Call")

    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Success")

    def log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On Failure")

    #### ASYNC #### - for acompletion/aembeddings
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success")

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Failure")

customHandler = MyCustomHandler()
litellm.callbacks = [customHandler]

## sync
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
    stream=True,
)
for chunk in response:
    continue

## async
import asyncio

async def call_acompletion():
    response = await acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
        stream=True,
    )
    async for chunk in response:
        continue

asyncio.run(call_acompletion())
```
Common Hooks
- `async_log_success_event` - Log successful API calls
- `async_log_failure_event` - Log failed API calls
- `log_pre_api_call` - Log before API call
- `log_post_api_call` - Log after API call

Proxy-only hooks (only work with LiteLLM Proxy):
- `async_post_call_success_hook` - Access user data + modify responses
- `async_pre_call_hook` - Modify requests before sending
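As a rough illustration of request modification, here is a hedged sketch of an `async_pre_call_hook` body. On the proxy this method would live on a `CustomLogger` subclass; it is shown here as a bare coroutine so the logic can be exercised directly, and the `"team"` metadata key plus the dummy `None` arguments are made-up for the example.

```python
import asyncio

# Hedged sketch of an async_pre_call_hook (proxy-only). The hook receives the
# request body as a dict and can mutate it before the request is sent upstream.
async def async_pre_call_hook(user_api_key_dict, cache, data, call_type):
    # Inject a default metadata tag into every request (example key)
    data.setdefault("metadata", {})["team"] = "analytics"
    return data

# Simulated invocation (no proxy running):
request_data = {"model": "gpt-3.5-turbo", "messages": []}
modified = asyncio.run(async_pre_call_hook(None, None, request_data, "completion"))
print(modified["metadata"])
```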
Example: Modifying the Response in async_post_call_success_hook
You can use `async_post_call_success_hook` to add custom headers or metadata to the response before it is returned to the client. For example:
```python
async def async_post_call_success_hook(data, user_api_key_dict, response):
    # Add a custom header to the response
    additional_headers = getattr(response, "_hidden_params", {}).get("additional_headers", {}) or {}
    additional_headers["x-litellm-custom-header"] = "my-value"
    if not hasattr(response, "_hidden_params"):
        response._hidden_params = {}
    response._hidden_params["additional_headers"] = additional_headers
    return response
```
This allows you to inject custom metadata or headers into the response for downstream consumers. You can use this pattern to pass information to clients, proxies, or observability tools.
Callback Functions
If you just want to log on a specific event (e.g. on input) - you can use callback functions.
You can set custom callbacks to trigger for:
- `litellm.input_callback` - Track inputs/transformed inputs before making the LLM API call
- `litellm.success_callback` - Track inputs/outputs after making the LLM API call
- `litellm.failure_callback` - Track inputs/outputs + exceptions for litellm calls
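A failure callback takes the same four positional arguments as a success callback. The `"exception"` key used below is an assumption about what litellm places in `kwargs` on failure; the mocked call just exercises the function without making an API call.

```python
import datetime

# Hedged sketch of a failure callback. Registration would be:
#   litellm.failure_callback = [log_failure]
def log_failure(kwargs, completion_response, start_time, end_time):
    err = kwargs.get("exception")   # assumed key; holds the raised error on failure
    model = kwargs.get("model")
    return f"call to {model} failed: {err}"

# Simulated invocation with mocked kwargs (no API call made):
now = datetime.datetime.now()
msg = log_failure({"model": "gpt-3.5-turbo", "exception": "RateLimitError"}, None, now, now)
print(msg)
```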
Defining a Custom Callback Function
Create a custom callback function that takes specific arguments:

```python
def custom_callback(
    kwargs,               # kwargs to completion
    completion_response,  # response from completion
    start_time, end_time  # start/end time
):
    # Your custom code here
    print("LITELLM: in custom callback function")
    print("kwargs", kwargs)
    print("completion_response", completion_response)
    print("start_time", start_time)
    print("end_time", end_time)
```
Setting the custom callback function

```python
import litellm
litellm.success_callback = [custom_callback]
```
Using Your Custom Callback Function

```python
import litellm
from litellm import completion

# Assign the custom callback function
litellm.success_callback = [custom_callback]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
)

print(response)
```
Async Callback Functions
We recommend using the Custom Logger class for async.
```python
import asyncio
import litellm
from litellm.integrations.custom_logger import CustomLogger
from litellm import acompletion

class MyCustomHandler(CustomLogger):
    #### ASYNC ####
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Success")

    async def async_log_failure_event(self, kwargs, response_obj, start_time, end_time):
        print("On Async Failure")

customHandler = MyCustomHandler()
litellm.callbacks = [customHandler]

async def call_acompletion():
    response = await acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
        stream=True,
    )
    async for chunk in response:
        continue

asyncio.run(call_acompletion())
```
Functions
If you just want to pass in an async function for logging: LiteLLM currently supports only async success callback functions for async completion/embedding calls.
```python
import asyncio
import litellm

async def async_test_logging_fn(kwargs, completion_obj, start_time, end_time):
    print("On Async Success!")

async def test_chat_openai():
    try:
        # litellm.set_verbose = True
        litellm.success_callback = [async_test_logging_fn]
        response = await litellm.acompletion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
            stream=True,
        )
        async for chunk in response:
            continue
    except Exception as e:
        print(f"An error occurred - {str(e)}")

asyncio.run(test_chat_openai())
```
What's Available in kwargs?
The kwargs dictionary contains all the details about your API call:

```python
def custom_callback(kwargs, completion_response, start_time, end_time):
    # Access common data
    model = kwargs.get("model")
    messages = kwargs.get("messages", [])
    cost = kwargs.get("response_cost", 0)
    cache_hit = kwargs.get("cache_hit", False)

    # Access metadata you passed in
    metadata = kwargs.get("litellm_params", {}).get("metadata", {})
```
Key fields in kwargs:
- `model` - The model name
- `messages` - Input messages
- `response_cost` - Calculated cost
- `cache_hit` - Whether the response was cached
- `litellm_params.metadata` - Your custom metadata
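To show the metadata round trip concretely, here is a hedged sketch: metadata passed to `completion(..., metadata={...})` surfaces in the callback under `kwargs["litellm_params"]["metadata"]`. The mocked kwargs below mirror that shape so the extraction logic can run without an API call, and the `"request_id"`/`"user_tier"` keys are invented example metadata.

```python
# Extract custom metadata from the callback's kwargs dict
def extract_metadata(kwargs):
    return kwargs.get("litellm_params", {}).get("metadata", {})

# Mocked kwargs, shaped like what a success callback receives (assumption)
mock_kwargs = {
    "model": "gpt-3.5-turbo",
    "litellm_params": {"metadata": {"request_id": "abc-123", "user_tier": "pro"}},
}
print(extract_metadata(mock_kwargs))
```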
Practical Examples
Track API Costs

```python
def track_cost_callback(kwargs, completion_response, start_time, end_time):
    cost = kwargs["response_cost"]  # litellm calculates this for you
    print(f"Request cost: ${cost}")

litellm.success_callback = [track_cost_callback]
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello"}])
```
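Building on the per-request cost above, a hedged sketch of accumulating total spend across calls by keeping a running total in module scope. The mocked kwargs dicts below stand in for what litellm would pass on each success.

```python
# Running total of spend across calls (sketch; not a litellm built-in)
total_cost = 0.0

def track_total_cost(kwargs, completion_response, start_time, end_time):
    global total_cost
    total_cost += kwargs.get("response_cost") or 0.0

# litellm.success_callback = [track_total_cost]  # how you would register it

# Simulated calls with mocked kwargs:
for mock in ({"response_cost": 0.002}, {"response_cost": 0.003}):
    track_total_cost(mock, None, None, None)
print(round(total_cost, 6))
```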
Log Inputs to LLMs

```python
def get_transformed_inputs(kwargs):
    params_to_model = kwargs["additional_args"]["complete_input_dict"]
    print("params to model", params_to_model)

litellm.input_callback = [get_transformed_inputs]
response = completion(model="claude-2", messages=[{"role": "user", "content": "Hello"}])
```
Send to External Service

```python
import requests

def send_to_analytics(kwargs, completion_response, start_time, end_time):
    data = {
        "model": kwargs.get("model"),
        "cost": kwargs.get("response_cost", 0),
        "duration": (end_time - start_time).total_seconds(),
    }
    requests.post("https://your-analytics.com/api", json=data)

litellm.success_callback = [send_to_analytics]
```
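One way to make a callback like this testable is to factor the payload construction out of the network call, a sketch under the assumption that `start_time`/`end_time` are `datetime` objects (the URL in the snippet above is a placeholder, and for high-traffic services an async hook keeps the POST from blocking the request path):

```python
import datetime

# Build the analytics payload separately so it can be unit tested offline
def build_analytics_payload(kwargs, start_time, end_time):
    return {
        "model": kwargs.get("model"),
        "cost": kwargs.get("response_cost", 0),
        "duration": (end_time - start_time).total_seconds(),
    }

# Simulated timing and kwargs (no network call made):
start = datetime.datetime(2024, 1, 1, 12, 0, 0)
end = start + datetime.timedelta(seconds=2)
payload = build_analytics_payload({"model": "gpt-3.5-turbo", "response_cost": 0.001}, start, end)
print(payload)
```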
Common Issues
Callback Not Called
Make sure you:
- Register callbacks correctly: `litellm.callbacks = [MyHandler()]`
- Use the right hook names (check spelling)
- Don't use proxy-only hooks in library mode
Performance Issues
- Use async hooks for I/O operations
- Don't block in callback functions
- Handle exceptions properly:

```python
from litellm.integrations.custom_logger import CustomLogger

class SafeHandler(CustomLogger):
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        try:
            await external_service(response_obj)
        except Exception as e:
            print(f"Callback error: {e}")  # Log but don't break the flow
```