Introduction#
It is said that Cloudflare is the cyber Buddha, a truly charitable enterprise spreading the gospel of serverless. The release of Workers AI has confirmed this once again: building AI applications and running AI models has never been easier. Now, let's build a simple example to see just how low the barrier is for developing an LLM application on CF Workers AI. 🤯
Preparations#
- A Cloudflare account
- A Python development environment
- The requests library (see the install command below)
- (Optional) the jsonpath library, used later to clean up the output
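If you don't have the two libraries yet, both can typically be installed with pip (using the package names as published on PyPI):

pip install requests jsonpath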
Tutorial#
Get Cloudflare Workers AI API Token#
Similar to other APIs, you need to obtain an API Token first for billing and identity verification purposes.
Go to the Cloudflare dashboard, log in to your Cloudflare account, then click the AI tab in the sidebar to enter the Workers AI main interface. From there, select Use REST API -> Get API Token. You will be taken to a new page; you don't need to change anything, just scroll to the bottom and click Continue to show summary to create the token. Then copy the generated token and keep it safe.
Please note that the generated token is displayed only once, and anyone holding it has direct access to your Workers AI resources, so save it in a secure place.
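Optionally, you can sanity-check the new token right away. Cloudflare exposes a token verification endpoint; a minimal sketch (with a placeholder token string) looks like this:

import requests

API_TOKEN = "paste-your-token-here"  # placeholder; use the token you just created
resp = requests.get(
    "https://api.cloudflare.com/client/v4/user/tokens/verify",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
print(resp.json())  # "success": True means the token works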
Initial Code#
Modify the Model and Token#
Return to the "Use Workers AI REST API" page and select the third option, python, in the code area below. It should look similar to the following:
import requests

API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{id}/ai/run/"
# {id} stands in for my account id, which is hidden here; fill in your own
headers = {"Authorization": "Bearer {API_TOKEN}"}

def run(model, inputs):
    input = { "messages": inputs }
    response = requests.post(f"{API_BASE_URL}{model}", headers=headers, json=input)
    return response.json()

inputs = [
    { "role": "system", "content": "You are a friendly assistant that helps write stories" },
    { "role": "user", "content": "Write a short story about a llama that goes on a journey to find an orange cloud" }
]

output = run("@cf/meta/llama-2-7b-chat-int8", inputs)
print(output)
Copy the code, open your prepared Python environment, and paste it in. Next, we'll switch the model to one better suited to Chinese users, qwen1.5-7b-chat-awq. Find the following line in the code:
output = run("@cf/meta/llama-2-7b-chat-int8", inputs)
Change @cf/meta/llama-2-7b-chat-int8 to @cf/qwen/qwen1.5-7b-chat-awq. Then, find the following line:
headers = {"Authorization": "Bearer {API_TOKEN}"}
Replace {API_TOKEN} with the API Token you just generated. Now, try running the program. If everything is in order, it should run smoothly and return a result.
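After both edits, those two lines should look something like this (the token value below is a made-up placeholder):

headers = {"Authorization": "Bearer aBcD1234eFgH"}  # your real token goes here
output = run("@cf/qwen/qwen1.5-7b-chat-awq", inputs)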
Modify the Prompt#
Next, we'll modify the system prompt and the user question that together form the AI's prompt. Find the following lines in the program:
inputs = [
    { "role": "system", "content": "You are a friendly assistant that helps write stories" },
    { "role": "user", "content": "Write a short story about a llama that goes on a journey to find an orange cloud" }
]
The content after "role": "system", is the LLM's system prompt. Change it to "你是一位友好的助手,帮助写故事" ("You are a friendly assistant that helps write stories"; adjust as needed). The content after "role": "user", is the question you ask the AI. Change it to whatever you like, for example, "Hello, please introduce yourself".
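After both edits, the inputs block might look like this (the user question is just the example above; ask whatever you like):

inputs = [
    { "role": "system", "content": "你是一位友好的助手,帮助写故事" },
    { "role": "user", "content": "Hello, please introduce yourself" }
]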
At this point, you have actually completed a basic example. It's that simple. Next, we will implement user input and output optimization.
Advanced Code#
Implement User Input#
This part is easy: just make a slight modification to the inputs variable. Here's the modified code:
userinput = input("请输入要提的问题:")  # "Please enter your question:"

inputs = [
    { "role": "system", "content": "你是一位优秀的中文助手" },  # "You are an excellent Chinese assistant"
    { "role": "user", "content": userinput }
]

output = run("@cf/qwen/qwen1.5-7b-chat-awq", inputs)
Here, we replaced the fixed question with userinput, which reads the user's input at runtime. With that, user input is implemented.
Output Optimization#
If you pay attention, you'll notice the current output is quite ugly: it's the raw JSON response, like this:
{'result': {'response': '你好!很高兴能为你提供帮助。有什么问题或需要帮助的吗?'}, 'success': True, 'errors': [], 'messages': []}
What we actually need is the content of the response field. To extract it, we'll use the optional jsonpath library. Import it at the top of the file with from jsonpath import jsonpath, then add the following line below the output assignment:
final = jsonpath(output, "$..response")[0]  # "$..response" grabs the nested response field
Then, change print(output) to print(final).
Now, your output will be a clean "你好!很高兴能为你提供帮助。有什么问题或需要帮助的吗?" ("Hello! I'm glad to help. Is there anything you'd like to ask?").
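As an aside, if you'd rather not depend on jsonpath, the same field can be read with plain dictionary indexing, since the structure of the response is known. A sketch, with minimal error handling added:

# Equivalent extraction without jsonpath
if output.get("success"):
    final = output["result"]["response"]
else:
    print("Request failed:", output.get("errors"))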
Going Further?#
Actually, as a single-round conversational AI, this is already fairly complete at this point. But we can go further and add multi-round conversation. The trick in the code below is to accumulate the whole conversation history in a list (info) and join it into the system prompt on every turn; the details are in the code, so I won't walk through them line by line.
Final Code#
import requests
from jsonpath import jsonpath

# The conversation history; every turn gets appended here
info = ["你是一位优秀的中文聊天助手,前文的所有聊天为:"]  # "You are an excellent Chinese chat assistant; the conversation so far is:"

API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{id}/ai/run/"
# Replace {id} with your account id
headers = {"Authorization": "Bearer {API_TOKEN}"}
# Replace {API_TOKEN} with your API Token

def run(model, inputs):
    input = { "messages": inputs }
    response = requests.post(f"{API_BASE_URL}{model}", headers=headers, json=input)
    return response.json()

# First round: plain system prompt, no history yet
userinput = input("请输入要提的问题:")
waittoaddU = "用户提问:" + userinput  # "User asked:"
info.append(waittoaddU)

inputs = [
    { "role": "system", "content": "你是一位优秀的中文助手" },
    { "role": "user", "content": userinput }
]

output = run("@cf/qwen/qwen1.5-7b-chat-awq", inputs)
final = jsonpath(output, "$..response")[0]
waittoaddA = "系统回答:" + final  # "System answered:"
info.append(waittoaddA)
print(final)

# Later rounds: the joined history becomes the system prompt
while True:
    userinput = input("请输入要提的问题:")
    if userinput == "EXIT":
        break
    inputs = [
        { "role": "system", "content": "\n".join(info) },
        { "role": "user", "content": userinput }
    ]
    output = run("@cf/qwen/qwen1.5-7b-chat-awq", inputs)
    waittoaddU = "用户提问:" + userinput
    info.append(waittoaddU)
    final = jsonpath(output, "$..response")[0]
    waittoaddA = "系统回答:" + final
    info.append(waittoaddA)
    print(final)
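As a closing side note: packing the history into the system prompt works, but chat endpoints like this one generally also accept assistant-role messages, so an arguably cleaner variant (a sketch, assuming the model honors a full messages array) grows the messages list itself:

import requests
from jsonpath import jsonpath

API_BASE_URL = "https://api.cloudflare.com/client/v4/accounts/{id}/ai/run/"  # fill in your account id
headers = {"Authorization": "Bearer {API_TOKEN}"}  # fill in your API Token

def run(model, inputs):
    response = requests.post(f"{API_BASE_URL}{model}", headers=headers, json={"messages": inputs})
    return response.json()

# The history lives in the messages list itself
messages = [{ "role": "system", "content": "你是一位优秀的中文助手" }]

while True:
    userinput = input("请输入要提的问题:")
    if userinput == "EXIT":
        break
    messages.append({ "role": "user", "content": userinput })
    output = run("@cf/qwen/qwen1.5-7b-chat-awq", messages)
    final = jsonpath(output, "$..response")[0]
    messages.append({ "role": "assistant", "content": final })
    print(final)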
Thank you for reading