You can deploy, manage, and delete top-level app functions directly from the Buildfunctions Dashboard or from your code using the Buildfunctions SDK.

Functions vs. Sandboxes

It is important to understand the distinction between Functions and Sandboxes in the Buildfunctions ecosystem:
  • Functions (CPUFunction, GPUFunction): Orchestrate top-level application or agent logic.
  • Sandboxes (CPUSandbox, GPUSandbox): Execute untrusted and dynamic agent actions with full GPU access, automatic model mounting, built-in AI frameworks, runtime dependency installs, and more. They spin up instantly for isolated tasks and can run for up to 24 hours.

This guide focuses on Functions: deploying and managing your top-level infrastructure.
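
A quick way to see the difference in the SDK (a minimal sketch; the calls mirror the examples later in this guide, and the option values are illustrative):
JavaScript
import { CPUFunction, CPUSandbox } from 'buildfunctions';

// A Function: persistent, deployable top-level logic with a named handler entry point
const appFunction = CPUFunction.create({
    name: "my-app",
    language: "javascript",
    runtime: "node",
    code: "./handler.js"
});
await appFunction.deploy();

// A Sandbox: an ephemeral, isolated environment for a dynamic task
const sandbox = await CPUSandbox.create({ name: "scratchpad", runtime: "node" });
const result = await sandbox.run(`console.log("hello from the sandbox")`);
console.log(result.stdout);
await sandbox.delete();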

Building functions

Below is a high-level overview of how to structure your Buildfunctions handler functions and their responses. The examples all follow the same pattern: a main entry point named handler that returns (or echoes) a response containing at least a body. Beyond that, you can optionally include a status code and headers, using whatever syntax is natural for the language.
  1. The handler function:
  • Naming: Your main function must be named handler. This name is how Buildfunctions identifies which function to invoke when your code runs.
  • Purpose: The handler function is your function’s entry point. It contains the logic you want to run whenever your function is invoked. You can write other helper functions, but only the handler will be called automatically.
  2. The response format:
Your handler function must provide a response that can be returned to the caller. While the exact syntax differs by language, the structure is essentially the same:
  • body (string, required): The main content you want to return.
  • statusCode (number, optional): A numeric HTTP status code (e.g., 200 for success, 500 for a server error).
  • headers (object, optional): Custom headers, such as Content-Type, can also be included.

Cache-related functionality will be available soon.

Example Response Structures

async function handler(event) {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'text/plain' },
    body: 'Hello, world!',
  };
}
Omit constructs like if __name__ == "__main__" in Python or a trailing handler() call in JavaScript; the platform handles invocation.
function handler() {
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'text/plain' },
    body: 'Hello, world!',
  };
}

// Do NOT include a manual call like:
// handler(); // Unnecessary; the platform invokes handler automatically
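
The same applies to a Python handler (a minimal sketch; the event and context parameters mirror the CPU function example further below):
Python
def handler(event, context):
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/plain"},
        "body": "Hello, world!",
    }

# Do NOT include a manual invocation at the end of the file:
# if __name__ == "__main__":
#     handler(None, None)  # Unnecessary; the platform invokes handler for you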

Supported Runtimes

Buildfunctions supports a variety of modern runtimes for your functions:
  • Python
  • Node.js
  • Deno
  • Go
  • Shell
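
For the Shell runtime, the handler is a shell function that echoes its response rather than returning it (a sketch only; the JSON shape is assumed from the response format above, so check the Shell runtime reference for the exact convention):
Shell
handler() {
  # Echo a JSON response with at least a body (shape assumed from the response format described above)
  echo '{"statusCode": 200, "headers": {"Content-Type": "text/plain"}, "body": "Hello, world!"}'
}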

CPU Functions

CPU functions are suitable for general-purpose workloads, orchestration, and lightweight tasks.

1. Function Code (Multi-Language Examples)

Python
import os

def handler(event, context):
    # Retrieve the environment variable 'RANDOM_VAR'
    random_variable_value = os.getenv("RANDOM_VAR")
    print(f"The value of 'RANDOM_VAR' is: {random_variable_value}")

    # Construct the response body
    body = "Hello, world! To see your log, please refer to the logs page."

    # Return the response
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8"
        },
        "body": body
    }
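
A Node.js equivalent of the same handler (a sketch; it assumes the runtime exposes environment variables through process.env as in standard Node):
JavaScript
function handler(event) {
    // Read the environment variable 'RANDOM_VAR'
    const randomVariableValue = process.env.RANDOM_VAR;
    console.log(`The value of 'RANDOM_VAR' is: ${randomVariableValue}`);

    // Return the response
    return {
        statusCode: 200,
        headers: { 'Content-Type': 'text/html; charset=utf-8' },
        body: 'Hello, world! To see your log, please refer to the logs page.',
    };
}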

2. Deploy via SDK

JavaScript
import { CPUFunction } from 'buildfunctions';

const cpuFunction = CPUFunction.create({
    name: "hello-world",
    language: "javascript",
    runtime: "node",
    memory: "512MB",
    code: "./handler.js"
});

await cpuFunction.deploy();

GPU Functions

Deploying a GPU function involves defining the configuration (hardware, runtime) and the code to be executed.

1. Function Code (Streaming Architecture)

This example demonstrates a streaming text generation function using transformers. It includes a requirements block to specify dependencies.
Python
"""
Name: streaming-text-generation

Python GPU Function for text generation with a streaming response using PyTorch and Transformers. Built and deployed on Buildfunctions.

requirements (add these within the Buildfunctions dashboard or via SDK)
transformers
accelerate

"""

import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Global variables for caching
model = None
tokenizer = None
device = None

def initialize_model():
    global model, tokenizer, device
    try:
        if model is not None and tokenizer is not None:
            return

        device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        torch.backends.cudnn.benchmark = True
        
        model_path = "/mnt/storage/Llama-3.2-3B-Instruct-bnb-4bit"

        config = AutoConfig.from_pretrained(model_path)
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        bnb_config = BitsAndBytesConfig(load_in_4bit=True)

        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            config=config,
            torch_dtype=torch.float16,
            device_map="auto",
            quantization_config=bnb_config if not hasattr(config, "quantization_config") else None,
        )
        # Note: with device_map="auto" the model is already dispatched to the GPU,
        # and 4-bit quantized models do not support .to(), so no manual .to(device) call is needed.

    except Exception as e:
        print(f"Error initializing the model: {e}")
        raise RuntimeError("Failed to initialize the model.")

async def stream_tokens(prompt):
    try:
        initialize_model()
        input_ids = tokenizer(prompt, return_tensors="pt", max_length=512, truncation=True).input_ids.to(device)

        yield b"<<START_STREAM>>\n"

        with torch.no_grad():
            past_key_values = None
            generated_ids = input_ids

            for _ in range(200):
                try:
                    outputs = model(
                        input_ids=generated_ids,
                        past_key_values=past_key_values,
                        use_cache=True,
                    )
                    logits = outputs.logits[:, -1, :]
                    
                    # Simple sampling
                    next_token_id = torch.argmax(logits, dim=-1, keepdim=True)

                    past_key_values = outputs.past_key_values
                    generated_ids = next_token_id

                    token = tokenizer.decode(next_token_id.squeeze(), skip_special_tokens=True)
                    yield f"<<STREAM_CHUNK>>{token}<<END_STREAM_CHUNK>>\n".encode()

                    if next_token_id.squeeze().item() == tokenizer.eos_token_id:
                        break

                except Exception as gen_error:
                    print(f"Error during token generation: {gen_error}")
                    break

        yield b"<<END_STREAM>>\n"

    except Exception as e:
        print(f"Error in streaming tokens: {e}")
        yield b"<<STREAM_ERROR>>\n"

async def async_stream_wrapper(prompt):
    async for chunk in stream_tokens(prompt):
        yield chunk

def handler():
    try:
        prompt = "Tell me about the most mysterious phenomena in the universe."
        return {
            "statusCode": 200,
            "headers": {
                "Content-Type": "text/event-stream",
                "Cache-Control": "no-cache",
                "Access-Control-Allow-Origin": "*",
            },
            "body": async_stream_wrapper(prompt),
        }
    except Exception as e:
        return {"statusCode": 500, "body": {"error": "Internal Server Error"}}

2. Deploy via SDK

Use GPUFunction.create to deploy. You can specify advanced configuration like gpu type, cpu count, and timeout.
JavaScript
import { GPUFunction } from 'buildfunctions';

const gpuFunction = GPUFunction.create({
    name: "streaming-text-gen",
    language: "python",
    gpu: "T4",
    memory: "2GB",
    timeout: 180,
    code: "./streaming_function.py" // Path to your file
});

console.log("Deploying function...");
const deployed = await gpuFunction.deploy();

Function Configuration

When creating functions, you can configure the following resources:
  • gpu: GPU type (e.g., T4).
  • memory: RAM allocation (e.g., 512MB, 2GB, 16GB).
  • timeout: Execution timeout in seconds.
  • runtime: Execution environment (e.g., deno, node).
For specific runtimes, you can include dependency instructions:
  • Python: Add a requirements.txt block in your code comments or string.
  • Deno: Add run arguments like deno run --allow-ffi.
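
As a combined sketch of these options (values are illustrative; it assumes the timeout option shown for GPU functions is also accepted for CPU functions, and Python dependency blocks follow the docstring convention shown in the GPU example above):
JavaScript
import { CPUFunction } from 'buildfunctions';

// Sketch: combining the configuration options listed above (illustrative values;
// the timeout field is assumed to apply to CPU functions as it does to GPU functions)
const configuredFunction = CPUFunction.create({
    name: "config-example",
    language: "javascript",
    runtime: "deno",      // execution environment
    memory: "1GB",        // RAM allocation
    timeout: 120,         // execution timeout in seconds
    code: "./handler.js"
});

await configuredFunction.deploy();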

Function Management

Find and Delete

You can search for and delete functions using the main client.
JavaScript
import { Buildfunctions } from 'buildfunctions';

async function manageFunction() {
    const apiToken = process.env.BUILDFUNCTIONS_API_KEY;
    const buildfunctions = await Buildfunctions({ apiToken });
    
    const functionName = "streaming-text-gen";

    try {
        console.log(`🔍 Searching for function: ${functionName}`);
        
        // Find existing function
        const targetFunction = await buildfunctions.functions.findUnique({
            where: { name: functionName }
        });

        if (targetFunction) {
            console.log(`Found: ${targetFunction.id}`);
            
            // Delete function
            console.log(`🗑️ Deleting function...`);
            await targetFunction.delete();
            console.log('✅ Function deleted successfully');
        } else {
            console.log('ℹ️ Function not found');
        }
    } catch (error) {
        console.error('❌ Operation failed:', error.message);
    }
}

Advanced Usage: Nested Orchestration

A powerful pattern in Buildfunctions is using a top-level Function to orchestrate nested Sandboxes. This allows you to combine persistent endpoints with ephemeral, high-performance compute.

Example: GPU Function spawning nested Sandboxes

This example demonstrates a Node.js GPU Function that spins up both a CPU sandbox (for text analysis) and a GPU sandbox (for Python inference).
JavaScript
import { Buildfunctions, GPUSandbox, CPUSandbox } from 'buildfunctions';

async function handler(event) {
  const buildfunctions = await Buildfunctions({ apiToken: process.env.BUILDFUNCTIONS_API_KEY });

  // 1. Spin up a CPU and a GPU sandbox in parallel
  const [cpuSandbox, gpuSandbox] = await Promise.all([
    CPUSandbox.create({ name: 'text-analyzer', runtime: 'node' }),
    GPUSandbox.create({ name: 'model-inference', language: 'python', gpu: 'T4' })
  ]);

  try {
    // 2. Execute tasks in parallel
    const [analysis, prediction] = await Promise.all([
      cpuSandbox.run(`...`), // CPU task
      gpuSandbox.run(`...`)  // GPU task
    ]);

    return {
      statusCode: 200,
      body: JSON.stringify({ analysis: analysis.stdout, prediction: prediction.stdout })
    };
  } finally {
    // 3. Cleanup
    await Promise.all([cpuSandbox.delete(), gpuSandbox.delete()]);
  }
}