· 3 min read
Katie Schilling

We’re transitioning to virtual hosted style URLs for all new buckets created after February 19, 2025. For new buckets, we will stop supporting path style URLs. Buckets created before February 19, 2025 will continue to work with either path style or virtual hosted style URLs.

The path style URL looks like this: https://fly.storage.tigris.dev/tigris-example/bar.txt

The virtual hosted style URL looks like this: https://tigris-example.fly.storage.tigris.dev/bar.txt

With a path style URL, the subdomain is always fly.storage.tigris.dev. By moving to virtual hosted style URLs, the subdomain is specific to the bucket. This additional specificity lets us make some key improvements for security and scalability.

Why make this change now?

Recently, some ISPs blocked the Tigris subdomain after malicious content was briefly shared using our platform. Though we removed the malicious content, the subdomain was the common denominator across several reports and was added to blocklists maintained by security vendors. This block of our domain resulted in failed downloads on several ISPs with unclear error messages: either the DNS resolved to an IP not owned by Tigris, or there were connection errors that implied a network issue. We’re sure this was frustrating for folks to debug.

We have been working with the security vendors to remove our domain from their blocklists. However, the long-term solution is to move to virtual hosted style URLs so that the subdomain is no longer the common denominator when identifying content.

How does this impact your code?

You’ll need to update your code anywhere you use path style access, such as for presigned URLs. You’ll also need to configure your S3 client libraries to use virtual hosted style URLs. Some examples are below. If we’ve missed your framework, please reach out and we’ll help.

import boto3
from botocore.config import Config

# Tell boto3 to use virtual hosted style addressing; the bucket name
# is prepended to the endpoint hostname in every request.
svc = boto3.client(
    's3',
    endpoint_url='https://fly.storage.tigris.dev',
    config=Config(s3={'addressing_style': 'virtual'}),
)
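
If you generate presigned URLs, the same client will now produce virtual hosted style links. Here's a minimal sketch, using the example bucket and object from above:

url = svc.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'tigris-example', 'Key': 'bar.txt'},
    ExpiresIn=3600,
)
# With virtual addressing, url looks like:
# https://tigris-example.fly.storage.tigris.dev/bar.txt?X-Amz-...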

With this move to virtual hosted style URLs, we’re undoubtedly going to break some existing workflows as new buckets are created. If this creates a hardship for you, please contact us at help@tigrisdata.com and we'll find a solution.

Want to try Tigris?

Make a bucket and store your models, training data, and artifacts across the globe! No egress fees.

· 9 min read
Xe Iaso

A cartoon tiger desperately runs away from a datacentre fire. Image generated using Flux [pro].

The software ecosystem is built on a bedrock of implicit trust. We trust that software won’t have deliberately placed security vulnerabilities and won’t be yanked offline without warning. AI models aren’t exactly software, but they’re distributed using a lot of the same platforms and technology as software. Thus, people assume they’re distributed under the same social contract as software.

The AI ecosystem has a lot of the same distribution and trust challenges as software ecosystems do, but with much larger blobs of data that are harder to introspect. There are fears that something bad will happen with some large model, creating a splash even greater than the infamous left-pad incident of 2016. These kinds of attacks seem unthinkable, but are inevitable.

How can you defend against AI supply-chain attacks? What are the risks? Today I’m going to cover what we can learn from the left-pad incident and how making a copy of the models you depend on can make your products more resilient.

· 19 min read
Xe Iaso

A majestic blue tiger surfing on the back of a killer whale. The image evokes Ukiyo-E style framing. Image generated using Flux [pro].

DeepSeek R1 is a frontier mixture-of-experts reasoning model, released by DeepSeek on January 20th, 2025. Along with making the model available via DeepSeek's API, they released the model weights on Hugging Face and a paper about how they got it working.

DeepSeek R1 is a Mixture of Experts model. This means that instead of all of the model weights being trained and used at the same time, the model is broken up into 256 "experts" that each handle different aspects of the response. This doesn't mean that one "expert" is best at philosophy, music, or other subjects; in practice, one expert will end up specializing in the special tokens (begin message, end message, role of interlocutor, etc.), another will specialize in punctuation, some will focus on visual description words or verbs, and some can even focus on proper names or numbers. The main advantage of a Mixture of Experts model is that it gets you much better results with much less compute spent in training and at inference time. There are some minor difficulties involved in making sure that tokens get spread out between the experts in training, but it works out in the end.
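
To make the routing idea concrete, here's a minimal sketch of top-k expert routing in plain numpy. Everything here (the sizes, eight experts, top-2 routing) is illustrative only, not DeepSeek's actual implementation:

import numpy as np

def moe_layer(x, experts, gate, top_k=2):
    # x: one token's hidden state, shape (d,)
    # gate scores decide which experts process this token
    scores = gate @ x                                   # (num_experts,)
    chosen = np.argsort(scores)[-top_k:]                # indices of the top_k experts
    weights = np.exp(scores[chosen])
    weights /= weights.sum()                            # softmax over the chosen experts
    # only the chosen experts run; their outputs are blended by gate weight
    return sum(w * (experts[i] @ x) for w, i in zip(weights, chosen))

rng = np.random.default_rng(0)
d, num_experts = 16, 8
x = rng.normal(size=d)
experts = rng.normal(size=(num_experts, d, d))  # one weight matrix per expert
gate = rng.normal(size=(num_experts, d))        # router that scores each expert
print(moe_layer(x, experts, gate).shape)        # (16,)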

· One min read
Xe Iaso

A bunch of wrenches on a tool rack.

Recently Amazon made changes to the S3 libraries that broke Tigris support. We have made fixes on our end and you can upgrade to the latest releases of the AWS CLI, AWS SDK for Python (boto3), AWS SDK for JavaScript, AWS SDK for Java and AWS SDK for PHP.

If you are running into any issues with these updated SDK releases, please reach out via Bluesky, LinkedIn, or X (formerly Twitter).

· 8 min read
Xe Iaso

A majestic blue tiger riding on a sailing ship. The tiger is very large. Image generated using PonyXL.

AI models can get pretty darn large. Larger models seem to perform better than smaller models, but we don’t quite know why. My work MacBook has 64 gigabytes of RAM and I’m able to use nearly all of it when I do AI inference. Somehow these 40+ gigabyte blobs of floating point numbers are able to take a question about the color of the sky and spit out an answer. At some level this is a miracle of technology, but how does it work?

Today I’m going to cover what an AI model really is and the parts that make it up. I’m not going to cover the linear algebra at play nor any of the neural network internals. Most people want to start with an off-the-shelf model, anyway.

· 4 min read
Xe Iaso

Hey all. Recently AWS released boto3 version 1.36.0, and in the process they changed how the upload_file call works. This will cause uploads to Tigris with boto3 version 1.36.0 or higher to fail with the following error message:

boto3.exceptions.S3UploadFailedError: Failed to upload ./filename.jpg to mybucket/filename.jpg: An error occurred (MissingContentLength) when calling the PutObject operation: You must provide the Content-Length HTTP header.

In order to work around this, downgrade boto3 to the last release of version 1.35.x:

pip install boto3==1.35.95

Make sure that you persist this in your requirements.txt, pyproject.toml, or whatever you use to do dependency management.

You might also hit this with the JavaScript client at version v3.729.0 or later. In order to fix that, downgrade to version v3.728.0:

npm install @aws-sdk/client-s3@3.728.0
npm install @aws-sdk/s3-request-presigner@3.728.0

Make sure the changes are saved in your package.json file.

We’re fixing this on our end, but we want to take a minute to clarify why this is happening and what it means for Tigris to be S3 compatible.

What does it mean to be S3 compatible?

At some level, an API is just a set of calls that have listed side effects. You upload an object by name and later are able to get that object back when you give the name. The devil is in the details, and like any good API there are a lot of details.

In a perfect world, when you switch to Tigris, you drop it into place and never need to think about it again. We don’t live in a perfect world, so Tigris has a list of compatible API calls, and if your app only uses those calls, you’ll be fine. Most apps are perfectly happy with that set (in fact, most apps use only about five of the calls at most). We are adding support for any missing calls as reality demands and time allows. Our goal is that nothing breaks when anything else gets released, client or server.

S3’s API was originally meant to be used with Amazon S3. It has since become a cross-cloud standard; any cloud you can think of likely has an S3-compatible object storage system. It’s become the POSIX abstraction for the cloud. Any change to the API touches a whole host of edge cases that the creators of S3 probably don’t have in mind.

Tigris, Digital Ocean, MinIO, R2, and others were all affected by this change. We found out about this breakage when one of our tests broke in a new and exciting way that confused us. From what we can tell, users of boto3 and the JavaScript client found out about this change by their production code breaking without warning. Even some of AWS’ own example code broke with this change.

I feel bad for the team behind the S3 API changes; they’re probably not getting very much love from the developer community right now. If this were an outage, I’d say #hugops. I’m not sure what to say this time, other than that I hope this post helps you make your code work again.

We take S3 compatibility seriously, so we’re treating this as an incident and updating our testing practices to make sure we have more advance warning should this happen again.

We’re updating Tigris so that developers can use this new version of the S3 client. We’ll have that rolled out soon. Follow us on Bluesky @tigrisdata.com or on LinkedIn to keep up to date!

Want to try it out?

Make a global bucket with no egress fees and use it with Python or JavaScript.

· 10 min read
Xe Iaso

Earlier this year I started consolidating some workloads onto my homelab Kubernetes cluster. One of the last ones was a doozy. It's not a compute-hard or memory-hard workload; it's a storage-hard workload. I needed to move the DJ set recording bot for an online radio station off of its current cloud and onto my homelab, but I still wanted the benefits of the cloud, such as remote backups I don't have to think about.

This bot has been running for a decade and the dataset well predates that: over 675 GiB of DJ sets, including ones that were thought to be lost media. Each set is a 320 kbps MP3 file anywhere from 150 to 500 MB (at 320 kbps, an hour of audio is roughly 140 MB, so most sets run one to three and a half hours), with smaller text files alongside it.

Needless to say, this dataset is very important to me. The community behind this radio station is how I've met some of my closest friends. I want to make sure that it's backed up and available for anyone that wants to go back and listen to the older sets. I want to preserve these files and not just dump them in an Archive bucket or something that would make it hard or slow to access them. I want these to be easily accessible to help preserve the work that goes into live music performances.

Here's how I did it and made it easy with Tigris.

An extreme close-up of a tiger with blue and orange fur. The Kubernetes logo replaces its iris.

· 6 min read
Katie Schilling

What do you do when you need to serve a completely custom, 7+ billion parameter model with sub-10-second cold start times, without writing a Dockerfile or managing scaling policies yourself? It sounds impossible, but Beam's serverless GPU platform provides performant, scalable AI infrastructure with minimal configuration. Your code already does AI inference in a function; just add a decorator to get that function running somewhere in the cloud with whatever GPU you specify. It turns on when you need it and off when you don't. This can save you orders of magnitude over running a persistent GPU in the cloud.
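
To illustrate the shape of that workflow, here's a hypothetical sketch of the decorator pattern. The gpu_function decorator below is made up for illustration and is not Beam's actual API; check their docs for the real decorator:

import functools

def gpu_function(gpu: str):
    # Hypothetical decorator: a real platform would package the function,
    # run it on a remote worker with the requested GPU, and return the result.
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            print(f"[stub] would run {fn.__name__} on a {gpu} in the cloud")
            return fn(*args, **kwargs)
        return inner
    return wrap

@gpu_function(gpu="A100")
def infer(prompt: str) -> str:
    # your existing inference code stays unchanged
    return f"model output for: {prompt}"

print(infer("What color is the sky?"))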

Tigris tiger watching a beam from a ground satellite. Image generated with Flux [dev] from Black Forest Labs on fal.ai.

· 21 min read
Xe Iaso

When you get started with finetuning AI models, you typically pull the datasets and models from somewhere like the Hugging Face Hub. This is generally fine, but as your use case grows and gets more complicated, you're going to run into two big risks:

  • You're going to depend on the things that are critical to your business being hosted by someone else on a platform that doesn't have a public SLA (Service-Level Agreement, or commitment to uptime with financial penalties when it is violated).
  • Your dataset will grow beyond what you can fit into RAM (or even on your hard disk), and you'll have to start sharding it into chunks that are smaller than RAM (see the sketch after this list).
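
Here's a minimal sketch of that sharded access pattern: streaming records shard by shard from object storage so the full dataset never has to fit in RAM. The bucket name and key layout below are assumptions for illustration.

import boto3

s3 = boto3.client('s3', endpoint_url='https://fly.storage.tigris.dev')

def iter_records(bucket: str, prefix: str):
    # list shard objects lazily so the whole dataset never sits in memory
    paginator = s3.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get('Contents', []):
            body = s3.get_object(Bucket=bucket, Key=obj['Key'])['Body']
            for line in body.iter_lines():  # stream one record at a time
                yield line

# hypothetical layout: shards at train/shard-00000.jsonl, train/shard-00001.jsonl, ...
for record in iter_records('my-datasets', 'train/'):
    pass  # hand each record to your training loop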

Most of the stuff you'll find online deals with the "happy path" of training AI models, but the real world is not as kind as that happy path. Your data will be bigger than RAM. You will end up needing to make your own copies of datasets and models because they can be taken offline without warning. You will need to be able to move your work between providers because price hikes will happen.

The unfortunate part is that this is the place where you're left to figure it out on your own. Let's break down how to do larger scale model training in the real world with a flow that can expand to any dataset, model, or cloud provider with minimal changes required. We're going to show you how to use Tigris to store your datasets and models, and how to use SkyPilot to abstract away the compute layer so that you can focus on the actual work of training models. This will help you reduce the risk involved with training AI models on custom datasets by importing those datasets and models once, and then always using that copy for training and inference.

A blue tiger surfs the internet waves, object storage in tow. The image has an ukiyo-e style with flat pastel colors and thick outlines.

Generation details: generated using Counterfeit v3.0 with a ComfyUI flow stacking several LoRA adapters, plus four rounds of upscaling and denoising. Originally a sketch by Xe Iaso.

· 19 min read
Xe Iaso

A nomadic server hunting down wild GPUs in order to save money on its cloud computing bill. Image generated with Flux [dev] from Black Forest Labs on fal.ai.

Taco Bell is a miracle of food preparation. They manage to have a menu of dozens of items that all boil down to permutations of a handful of basic ingredients: meat, cheese, beans, vegetables, bread, and sauces. Those basic fundamentals are combined in new and interesting ways to give you the Crunchwrap, the Chalupa, the Doritos Locos Tacos, and more. Just add hot water and they’re ready to eat.

Even though the results are exciting, the ingredients are not. They’re all really simple things. The best-designed production systems I’ve ever used take the same basic idea: build exciting things out of boring components that are well understood across all facets of the industry (e.g. S3, Postgres, HTTP, JSON, YAML). This adds up to your pitch deck aiming at disrupting the industry-disrupting industry.

A bunch of companies want to sell you inference time for your AI workloads, or the results of running AI inference for you, but nobody really tells you how to build this yourself. That’s the special Mexican Pizza sauce that you can’t replicate at home no matter how much you want to.

Today, we’ll cover how you, a random nerd that likes reading architectural articles, should design a production-ready AI system so that you can maximize effectiveness per dollar, reduce dependency lock-in, and separate concerns down to their cores. Buckle up, it’s gonna be a ride.