How to Add a Firecrawl Function to Your Assistant

Firecrawl is a powerful tool that transforms entire websites into clean, LLM-ready markdown or structured data, making it ideal for AI applications that need to interact with web content.
With the Firecrawl function in Superinterface, your assistant can retrieve data by scraping specific pages, crawling sites, extracting targeted information, or performing web searches.
This guide will walk you through the process of setting up a Firecrawl function, enabling your assistant to interact with the web and extract valuable information.
Extract clean markdown from websites with Firecrawl.

Step One: Create a New Function

    Go to Assistants in the menu, and then create a new assistant or select an existing one.
    Navigate to the Functions tab within the assistant and click on New Function.
    In the New Function screen, choose Firecrawl as the response type to set up your Firecrawl function.
Create new Firecrawl function.

Step Two: Choose the Type of Firecrawl Action

After selecting Firecrawl, you’ll need to choose the action type. Each type is designed for different purposes:

1. Scrape

What It Does: The Scrape action lets your assistant visit a specific URL and retrieve all the content from that page.
Use Case: Ideal for gathering complete information from a single web page, such as collecting the text from a blog article.

2. Crawl

What It Does: The Crawl action not only retrieves content from the target URL but also follows links on that page, extracting content from linked sub-pages.
Use Case: Useful when you need data from multiple connected pages, like pulling information from all product pages within a category.

3. Extract

What It Does: Extract allows you to specify a prompt describing exactly what content you want to retrieve from a URL. This is processed by a language model (LLM) to extract only the specified data.
Use Case: Best for when you need specific elements from a page, such as headlines, price details, or contact information, without pulling everything.

4. Search

What It Does: Search lets the assistant run a search engine query. Firecrawl returns the top results from the search engine results page (SERP) for that query, based on your criteria.
Use Case: Useful for general information searches, like finding recent articles or resources related to a particular topic.
Select action type.
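
To make the difference between the four action types concrete, the sketch below maps each one to the kind of request it might correspond to. The base URL, endpoint paths, and field names (url, urls, prompt, query, limit) are assumptions modeled on Firecrawl's public v1 REST API, not something this guide specifies, and build_request is a hypothetical helper. Nothing is sent over the network.

```python
from typing import Any

# Assumed base URL and paths, modeled on Firecrawl's public v1 REST API.
BASE_URL = "https://api.firecrawl.dev/v1"

ENDPOINTS = {
    "scrape": "/scrape",    # one page
    "crawl": "/crawl",      # a page plus its linked sub-pages
    "extract": "/extract",  # LLM-driven extraction guided by a prompt
    "search": "/search",    # search engine results
}


def build_request(action: str, **params: Any) -> dict[str, Any]:
    """Return the URL and JSON body for an action type (hypothetical helper)."""
    if action not in ENDPOINTS:
        raise ValueError(f"Unknown action type: {action!r}")
    return {"url": BASE_URL + ENDPOINTS[action], "json": dict(params)}


scrape = build_request("scrape", url="https://example.com/blog/post")
crawl = build_request("crawl", url="https://example.com/products", limit=10)
extract = build_request(
    "extract",
    urls=["https://example.com/pricing"],
    prompt="List every plan name and its monthly price.",
)
search = build_request("search", query="recent articles about web scraping", limit=5)
```

In Superinterface you pick the action type from the list instead of writing requests like these; the sketch is only meant to show how the inputs differ between a URL-based action and a query-based one.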

Step Three: Configure the OpenAPI Function Specification

Pre-filled Function Specs

When you select an action type (Scrape, Crawl, Extract, or Search), the OpenAPI function specification at the top of the setup page is prefilled with a template automatically. This helps you get started quickly by providing a basic structure, which you can customize further by adjusting or adding parameters.
A pre-filled OpenAPI function specification for a Firecrawl scrape function.
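
For orientation, a function specification for a scrape action could look roughly like the sketch below, written here as a Python dictionary. This is an illustrative assumption, not the actual template Superinterface prefills: the description text and the single url parameter are hypothetical, and only the function name scrape comes from this guide.

```python
import json

# Hypothetical function specification; the real prefilled template may differ.
scrape_spec = {
    "name": "scrape",
    "description": "Scrape a single web page and return its content.",
    "parameters": {
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "The URL of the page to scrape.",
            },
        },
        "required": ["url"],
    },
}

# Sanity check: every required parameter should be declared under "properties".
declared = scrape_spec["parameters"]["properties"]
missing = [p for p in scrape_spec["parameters"]["required"] if p not in declared]

print(json.dumps(scrape_spec, indent=2))
```

A check like this can catch a common editing mistake when you customize the template: listing a parameter as required without defining it.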

Adding Parameters

Customize Your Requests: You can add extra parameters to the function. These parameters will be merged into the request sent to Firecrawl.
Body Customization: If you include additional data in the Body field, it will be combined with the default parameters from the OpenAPI function specification, giving you flexibility over how data is processed.
Tip: Refer to the Firecrawl Documentation to see which parameters and options are available for each type of action. This will help you customize the function to meet your specific needs.
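
The merging described above can be pictured as a dictionary merge. The sketch below is only an illustration: merge_body is a hypothetical helper, the option names formats and onlyMainContent are assumed from Firecrawl's scrape options, and the precedence rule (Body field values win on a key conflict) is a guess, since this guide does not say which side takes priority.

```python
from typing import Any


def merge_body(call_args: dict[str, Any], extra_body: dict[str, Any]) -> dict[str, Any]:
    """Combine the assistant's function-call arguments with the Body field.

    Assumption: on a key conflict, the Body field value wins.
    Returns a new dictionary; neither input is modified.
    """
    return {**call_args, **extra_body}


# Arguments the assistant supplies when it calls the function.
call_args = {"url": "https://example.com/blog/post"}

# Extra parameters configured in the Body field (assumed Firecrawl scrape options).
extra_body = {"formats": ["markdown"], "onlyMainContent": True}

request_body = merge_body(call_args, extra_body)
```

If the precedence matters for your setup, test it with a parameter that appears in both places before relying on either behavior.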

Step Four: Enter Your Firecrawl API Key

Authentication

To use Firecrawl, your assistant needs to authenticate with the Firecrawl service, which requires an API key. Here’s how to get it:
    Create a Firecrawl account and sign in.
    Go to your Dashboard.
    Click on API Keys in the left menu.
    Create a new API key and copy it.
Create your Firecrawl API key.
Once you have your Firecrawl API key:
    Return to your function setup in Superinterface.
    Paste the API key into the API Key field so that your assistant’s requests to Firecrawl are authenticated.
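
Once the key is saved in the API Key field, no code is required. For readers curious what authentication might look like on the wire, the sketch below builds the headers for a direct call, assuming Firecrawl uses bearer-token authentication. FIRECRAWL_API_KEY is a hypothetical environment variable name, auth_headers is a hypothetical helper, and fc-placeholder is a stand-in value, not a working key.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers, assuming bearer-token authentication."""
    if not api_key:
        raise ValueError("A Firecrawl API key is required.")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Read the key from the environment rather than hardcoding it in source files.
# The placeholder default only lets the sketch run without a real key.
headers = auth_headers(os.environ.get("FIRECRAWL_API_KEY", "fc-placeholder"))
```

Treat the key like a password: keep it out of shared documents and assistant instructions, and rotate it from the Firecrawl dashboard if it leaks.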

Example: Setting Up Firecrawl to Retrieve Latest News from The Guardian

Let’s walk through an example using Firecrawl to get the latest news articles from The Guardian. This example will show how you can configure Firecrawl to scrape web content directly from the homepage through your assistant.

Creating the Scrape Function

Before configuring the assistant, you’ll need to create a new Firecrawl function in the Functions tab:
    Go to the Functions tab of your assistant and click on New Function.
    Select Firecrawl as the response type and choose Scrape as the action.
    Insert your Firecrawl API key to authenticate the function.
    Keep the prefilled OpenAPI function specification as it is. Make sure the function is named scrape, as you will need to refer to this name in the assistant instructions.

Setting up the Assistant Instructions

Example instructions for the Guardian Reading assistant
When writing the assistant instructions, it is essential to specify both the function to be used and the URL to be scraped. In this example, the assistant is specifically designed to retrieve articles from The Guardian, so you would include the URL directly in the instructions.
However, if you want the assistant to allow users to specify a URL during the conversation, you can write more flexible instructions where the URL is not hardcoded but provided by the end user.

Explanation:

In this example, Firecrawl scrapes content from the homepage of The Guardian, retrieving the latest news articles. By instructing the assistant to call the function with the Scrape method, Firecrawl visits the URL and extracts the page content, providing users with the most recent headlines and summaries available on the site.
The assistant can be configured to present this information in a user-friendly format, adding commentary or highlighting specific articles. For example, you can instruct it to display just two stories at a time and suggest follow-up queries to keep users engaged. This ensures your assistant is not just delivering raw data but curating a helpful and interactive experience for users.
Assistant using Firecrawl to scrape The Guardian
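
Showing two stories at a time amounts to simple paging over the scraped results. In practice the assistant handles this through its instructions rather than code; the sketch below only illustrates the logic. The headline strings are made-up placeholders, not real scraped output, and pages_of is a hypothetical helper.

```python
from typing import Iterator


def pages_of(items: list[str], size: int = 2) -> Iterator[list[str]]:
    """Yield consecutive groups of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder headlines standing in for content extracted from a scraped page.
headlines = ["Headline A", "Headline B", "Headline C", "Headline D", "Headline E"]

pages = list(pages_of(headlines))
first_page = pages[0]  # the stories the assistant would present first
```

Each later page corresponds to a follow-up request from the user, such as asking for more stories.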
By setting up Firecrawl this way, you can keep your assistant consistently updated with the latest news from The Guardian or any other preferred news source. This approach can be adapted to fetch information from multiple websites, enabling your assistant to deliver comprehensive and timely content.

Ready to Publish?

You did it! Your assistant is now fully set up to retrieve and display web content using Firecrawl. If you're satisfied with your configuration, it's time to integrate your assistant with your website.
Simply click Publish to make it live. For a detailed guide on how to publish your assistant, visit Publish Interface.