Skip to main content

6 posts tagged with "ai"

View All Tags

· 6 min read
Katie Schilling

What do you do when you need to serve up a completely custom, 7+ billion parameter model with sub 10 second cold start times? And without writing a Dockerfile or managing scaling policies yourself. It sounds impossible, but Beam's serverless GPU platform provides performant, scalable AI infrastructure with minimal configuration. Your code already does the AI inference in a function. Just add a decorator to get that function running somewhere in the cloud with whatever GPU you specify. It turns on when you need it, it turns off when you don't. This can save you orders of magnitude over running a persistent GPU in the cloud.

Tigris tiger watching a beam from a ground satellite. Image generated with Flux [dev] from Black Forest Labs on fal.ai

Tigris tiger watching a beam from a ground satellite. Image generated with Flux [dev] from Black Forest Labs on fal.ai.

· 21 min read
Xe Iaso

When you get started with finetuning AI models, you typically pull the datasets and models from somewhere like the Hugging Face Hub. This is generally fine, but as your usecase grows and gets more complicated, you're going to run into two big risks:

  • You're going to depend on the things that are critical to your business being hosted by someone else on a platform that doesn't have a public SLA (Service-Level Agreement, or commitment to uptime with financial penalties when it is violated).
  • Your dataset will grow beyond what you can fit into ram (or even your hard disk), and you'll have to start sharding it into chunks that are smaller than ram.

Most of the stuff you'll find online deals with the "happy path" of training AI models, but the real world is not quite as kind as this happy path is. Your data will be bigger than ram. You will end up needing to make your own copies of datasets and models because they will be taken offline without warning. You will need to be able to move your work between providers because price hikes will happen.

The unfortunate part is that this is the place where you're left to figure it out on your own. Let's break down how to do larger scale model training in the real world with a flow that can expand to any dataset, model, or cloud provider with minimal changes required. We're going to show you how to use Tigris to store your datasets and models, and how to use SkyPilot to abstract away the compute layer so that you can focus on the actual work of training models. This will help you reduce the risk involved with training AI models on custom datasets by importing those datasets and models once, and then always using that copy for training and inference.

A blue tiger surfs the internet waves, object storage in tow. The image has an ukiyo-e style with flat pastel colors and thick outlines.

A blue tiger surfs the internet waves, object storage in tow. The image has an ukiyo-e style with flat pastel colors and thick outlines.

Details

Generation details Generated using Counterfeit v3.0 using a ComfyUI flow stacking several LoRA adapters as well as four rounds of upscaling and denoising. Originally a sketch by Xe Iaso.

· 5 min read
Xe Iaso

Docker is the universal package format of the internet. When you deploy software to your computers, chances are you build your app into a container image and deploy it through either Docker or something that understands the same formats that Docker uses. However, this is where they get you: Docker image storage in the cloud is not free. Docker registries also have strict image size limits and will charge you egress fees based on the size of your images.

What if you could host your own registry though? What if when doing it you could actually get a better experience than you get with the hosted registries on the big cloud.

A sea of scattered clouds covers the land beneath.

A sea of scattered clouds covers the land beneath. Photo by Xe Iaso, iPhone 15 Pro Max @ 22mm.

· 6 min read
Jesse Thompson

If you've been toying around in the AI space over the past few months, you've probably heard of Ollama. Ollama is a tool for running various LLMs locally on your own hardware, and currently supports a bunch of open models from Google, Facebook and independent sources.

Besides the basic terminal chat function, Ollama has an API for use from within your favourite programming languages. This means you can build your very own LLM-powered apps!

Let's say we've built the next killer LLM app: ChatWich (which allows you to chat with your sandwich) and people are loving it when you show it off on your laptop, but personally visiting all your customers with your computer in hand is getting tiring, and the travel bills are starting to outweigh the (awesome) frequent flyer miles you're getting.

It's time to move to the cloud.

· 9 min read
Brian Morrison II

Generative AI is a fantastic tool to use to quickly create images based on prompts.

One of the issues with some of these platforms is that they don’t actually store the images in a way that makes them easier to retrieve after they’ve been created. Oftentimes you have to make sure to save it immediately after the process is completed, otherwise, it's gone. Luckily, Stability offers an API that can be used to programmatically generate images, and Tigris is the perfect solution to store those images for retrieval.

In this article, you’ll learn how to deploy an app to Fly.io that will allow you to generate images using the Stability API and automatically store them in a Tigris bucket.