
8 posts tagged with "s3"


· 5 min read
Katie Schilling

A library with a fractal of bookshelves in all directions, wooden ladders connecting the floor to the shelves. Many blue tigers tend to the books. — Image generated with Flux [pro] 1.1 from Black Forest Labs on fal.ai


When you have a lot of data, maybe even Big Data ™️, you might start to wonder why you’re paying so much to keep it all hot and ready. Do you really need that prior version of your model weights from last year to be available instantly? Let’s be clear though: we’re happy to serve you petabytes of old model weights and datasets… but we’d rather help you save some money on your infrastructure budget.

When you create a new object or bucket, you can select the storage tier to put it in: Standard, Infrequent Access, or Archive. Everything you currently have in Tigris is likely in the Standard storage tier, and when you create new objects with the S3 API and don’t specify a storage tier, they’ll end up in Standard too.

We’ve updated our pricing with specifics, but you can expect to save $0.016 per GB per month by moving your backups and other old data from the Standard storage tier to the Archive storage tier. Storing one terabyte of data in the Archive tier costs $4 per month (at time of writing). At Infrequent Access rates, the same terabyte costs $10 per month, and at Standard, $20 per month. That’s a 5x cost reduction for data that you don’t need often and can tolerate waiting an hour or so for it to be pulled out of Archive.
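To make the arithmetic concrete, here’s a quick sketch using the per-GB rates quoted above (rates as of the time of writing; the tier keys are the S3 storage-class names used later in this post):

```python
# Monthly storage cost per tier, using the rates quoted above.
RATE_PER_GB = {
    "STANDARD": 0.020,      # $20 / TB / month
    "STANDARD_IA": 0.010,   # $10 / TB / month
    "GLACIER": 0.004,       # $4  / TB / month (Archive)
}

def monthly_cost(gigabytes: float, tier: str) -> float:
    """Return the monthly storage cost in dollars for a given tier."""
    return gigabytes * RATE_PER_GB[tier]

tb = 1000  # one terabyte, in decimal gigabytes
print(round(monthly_cost(tb, "STANDARD"), 2))  # 20.0
print(round(monthly_cost(tb, "GLACIER"), 2))   # 4.0
```

Moving that terabyte from Standard to Archive saves $16 per month, which is the $0.016-per-GB figure above.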

And, of course, none of our Storage Tiers include egress fees.

Want to try it out?

Make a global bucket with no egress fees

Deciding what tier to use

I’m sure you’ve heard of folks regretting their decision to archive data that they end up needing in a hurry. Here’s a good rule of thumb to decide where objects should go: how much downtime can you tolerate when everything’s on fire and you need that data NOW?

If you can tolerate an hour of downtime for that data to get restored from Archive, Archive is fine. If you can’t, Infrequent Access is probably the best bet: Tigris returns Infrequent Access objects as rapidly as Standard tier objects.

Your database backups from 3 years ago or the shared drive from a long-completed project are probably not going to be accessed very often (maybe even never), so it makes sense to Archive them just in case. Your database backups from about 10 minutes ago are much more likely to be accessed, so it makes sense to put them into Infrequent Access. That way you can respond instantly to the wrong database being deleted instead of having to wait for an hour for the backups to load from Archive.
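That rule of thumb can be written down as a tiny helper. The function and its arguments are illustrative, not a Tigris API; `GLACIER` and `STANDARD_IA` are the S3 storage-class names Tigris uses for Archive and Infrequent Access:

```python
def pick_storage_class(accessed_often: bool, max_restore_wait_minutes: float) -> str:
    """Rule of thumb: hot data stays in Standard; cold data goes to
    Archive only if you can wait ~an hour for a restore."""
    if accessed_often:
        return "STANDARD"
    # Archive restores can take an hour or so; only use it if you can wait.
    return "GLACIER" if max_restore_wait_minutes >= 60 else "STANDARD_IA"

print(pick_storage_class(False, 24 * 60))  # GLACIER: 3-year-old backups
print(pick_storage_class(False, 5))        # STANDARD_IA: last night's backups
```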

Here’s how to use object tiers

When you create a new bucket in the Tigris Dashboard, you can select which storage tier you want objects to use by default:

The Tigris Dashboard showing storage tier selection with three options: Standard, Infrequent Access, and Archive


Choose between:

  • Standard: the default storage class; it provides high durability, availability, and performance for frequently accessed data.
  • Infrequent Access: lower-cost storage for data that isn’t accessed frequently but requires rapid access when needed.
  • Archive: low-cost storage for long-term data archiving with infrequent access.

Otherwise, you can set it when you upload a file:

Standard
aws s3 cp --storage-class STANDARD hello.txt s3://your-bucket-name/your-object-name

Infrequent Access
aws s3 cp --storage-class STANDARD_IA hello.txt s3://your-bucket-name/your-object-name

Archive
aws s3 cp --storage-class GLACIER hello.txt s3://your-bucket-name/your-object-name
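Since Tigris speaks the S3 API, the same thing works from any S3 SDK. A minimal boto3 sketch (the bucket, key, and file path are placeholders; the endpoint shown is Tigris’s public S3 endpoint at the time of writing):

```python
# Friendly tier names mapped to the S3 storage-class identifiers shown above.
TIER_TO_STORAGE_CLASS = {
    "Standard": "STANDARD",
    "Infrequent Access": "STANDARD_IA",
    "Archive": "GLACIER",
}

def upload_with_tier(bucket: str, key: str, path: str, tier: str) -> None:
    """Upload a local file to Tigris with an explicit storage tier.
    Requires boto3 and credentials configured for Tigris."""
    import boto3  # imported lazily so the mapping above works without boto3

    s3 = boto3.client("s3", endpoint_url="https://fly.storage.tigris.dev")
    s3.upload_file(
        path, bucket, key,
        ExtraArgs={"StorageClass": TIER_TO_STORAGE_CLASS[tier]},
    )
```

For example, `upload_with_tier("your-bucket-name", "your-object-name", "hello.txt", "Archive")` is the SDK equivalent of the `--storage-class GLACIER` command above.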

What’s up next

I bet you're thinking, "Wow, this would be really cool to use with a Lifecycle Rule feature so I can better manage my backups and older objects." Us, too! Lifecycle Rules are coming soon.

Convinced? Make a new bucket today and give Tigris a try.

· 5 min read
Garren


Autumn trees on a dusty road in Magoebaskloof, South Africa. Photo by Garren Smith, iPhone 13 Pro.

Tigris now supports object notifications! Object notifications are how you receive events every time something changes in a bucket. Think of it as your bucket's way of saying "Hey, something happened! Come check it out!", much like the inotify subsystem in Linux. These notifications can be helpful for keeping track of what's going on in your application.

Use Case: Automatic Image Processing

Imagine you're building a photo-sharing app. Every time a user uploads a new picture, you want to automatically generate a thumbnail and maybe even run it through an AI to detect any inappropriate content. With object notifications, this becomes a breeze!

  1. User uploads an image to your Tigris bucket.
  2. Tigris sends a notification to your webhook.
  3. Your server receives the notification and springs into action.
  4. It downloads the new image, creates a thumbnail, and runs it through an AI check.
  5. The processed image and its metadata are saved back to Tigris.

All of this happens automatically, triggered by that initial upload.
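The server side of the steps above can be sketched as a small handler. The payload shape here is an assumption for illustration (an event list carrying a bucket and object key); check the actual notification schema before relying on it:

```python
def objects_to_process(payload: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs worth processing from a webhook payload,
    skipping our own thumbnail output to avoid an infinite notify loop."""
    todo = []
    for event in payload.get("events", []):
        bucket = event.get("bucket")
        key = event.get("object", {}).get("key", "")
        if key and not key.startswith("thumbnails/"):
            todo.append((bucket, key))
    return todo

sample = {
    "events": [
        {"bucket": "photos", "object": {"key": "uploads/cat.jpg"}},
        {"bucket": "photos", "object": {"key": "thumbnails/cat.jpg"}},
    ]
}
print(objects_to_process(sample))  # [('photos', 'uploads/cat.jpg')]
```

Note the prefix filter: if your handler writes thumbnails back into the same bucket, each write fires another notification, so filtering out your own output keeps the pipeline from looping forever.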

Behind the Scenes: Building Object Notifications

Now, let's pull back the curtain and see how we built this feature and a few tricky situations we had to handle. Grab your hard hat, because we're going on a little tour of Tigris's inner workings!

Tigris isn't just any object store – it's a global object store. Objects can be changed in multiple regions around the world, which keeps them available close to your users, always ready when you need them. But it also means we need a way of keeping track of all the changes to the same object. This is where replication comes in.

Replication: Keeping Everyone in the Loop

To make sure everything stays in sync, we replicate changes to multiple regions. This ensures high availability and improved redundancy of our objects.

The caveat to this is that replication is a background task, and the speed at which an object is replicated from one region to another can be affected by many external factors.

To solve this, when a region receives a change, it looks at the Last-Modified timestamp in the object's metadata to determine whether the change is new and needs to be applied, or whether the region has already seen a newer one. If the change is older, it is discarded.


The Object Notification Hub

When object notifications are enabled for a bucket, we assign one region to be the object notification hub for that bucket. This region gets the important job of keeping track of all the changes. We create a special index, very similar to a secondary index, in that region's FoundationDB. The changes are ordered by the FoundationDB Versionstamp assigned when each change is added to the index, and by the Last-Modified timestamp of the object metadata.

The Versionstamp helps the worker keep track of which events it has seen and processed.
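A toy model of that index, with a plain counter standing in for FoundationDB's monotonically increasing Versionstamp:

```python
import itertools

_versionstamp = itertools.count(1)  # stand-in for FoundationDB Versionstamps
index: list[tuple[int, str, str]] = []  # (versionstamp, last_modified, key)

def record_change(key: str, last_modified: str) -> None:
    """Append a change to the hub's index; the versionstamp records
    when the change arrived, not when the object was modified."""
    index.append((next(_versionstamp), last_modified, key))

record_change("a.txt", "2024-10-01T11:00Z")
record_change("b.txt", "2024-10-01T10:59Z")  # replicated late; older timestamp

# The worker reads entries after its last-seen versionstamp, in index order:
cursor = 0
pending = [entry for entry in index if entry[0] > cursor]
```

Because entries are keyed by arrival order, the worker can resume from its cursor after a crash without rescanning the whole index.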

Why one region, you may ask? If we didn't do this, we'd end up with multiple regions sending the same events to the webhook (hello, friendly DDoS attack), or we'd have to build a complex system to coordinate the regions so they don't send duplicate events.

The Background Task: Our Diligent Messenger

In our object notification region, we have a background task running. Think of it as a tireless worker that's always on the lookout for changes. Every so often, it checks the special index we mentioned earlier, collects all the latest changes, and sends them off to the webhook.

The worker also keeps track of the last processed change and retries a few times if a request fails. Finally, it removes old changes from the index that have already been processed.
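A toy version of that loop; `run_once`, the tuple-shaped index, and the webhook callback are all made up for illustration:

```python
def run_once(index, last_processed, send_to_webhook, max_retries=3):
    """One cycle of the background task: read unsent changes, deliver
    with a few retries, then prune what was delivered."""
    batch = [entry for entry in index if entry[0] > last_processed]
    if not batch:
        return index, last_processed
    for _attempt in range(max_retries):
        if send_to_webhook(batch):
            last_processed = batch[-1][0]
            # prune delivered changes from the index
            index = [e for e in index if e[0] > last_processed]
            return index, last_processed
    return index, last_processed  # give up for now; retry next cycle

calls = []
def flaky(batch):
    calls.append(len(batch))
    return len(calls) >= 2  # fail once, then succeed

idx = [(1, "a.txt"), (2, "b.txt")]
idx, seen = run_once(idx, 0, flaky)
print(idx, seen)  # [] 2 — both changes delivered on the second attempt
```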

Why We Can't Guarantee Ordered Events

We talked about how object changes replicated from many regions can arrive at different times. The problem arises when the worker is ready to send the latest events for an object: it has no way of knowing whether all changes for that object have been replicated to its region. It could in theory contact every region and check, but this would be prohibitively expensive, and still not a complete guarantee.

This forces us to make the trade-off of sending events out of order. The worker reads the latest list of changes that have been replicated to its region and sends them to the webhook.
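On the receiving end, this means a webhook consumer should treat events as unordered and keep only the newest state it has seen per key. A minimal consumer-side sketch (the string comparison works because the illustrative timestamps share one ISO-8601 format):

```python
latest: dict[str, str] = {}  # key -> newest Last-Modified seen so far

def handle_event(key: str, last_modified: str) -> bool:
    """Apply an event only if it's newer than what we've already seen;
    return whether it was applied."""
    if key in latest and latest[key] >= last_modified:
        return False  # stale or duplicate event, ignore it
    latest[key] = last_modified
    return True

print(handle_event("a.txt", "2024-10-01T11:00Z"))  # True
print(handle_event("a.txt", "2024-10-01T10:00Z"))  # False: arrived late
```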

Wrapping Up

That's how we built object notifications in Tigris. We took a global system, added some global replication, threw in a change index, and topped it off with a hardworking background task.

The result? A system that keeps you in the loop about what's happening in your buckets, no matter where in the world those changes occur. Whether you're building the next big photo-sharing app or just want to keep tabs on your storage, object notifications have got your back!

We hope this peek behind the scenes was fun and informative. Happy coding!

· 5 min read
Xe Iaso

Docker is the universal package format of the internet. When you deploy software to your computers, chances are you build your app into a container image and deploy it through either Docker or something that understands the same formats that Docker uses. However, this is where they get you: Docker image storage in the cloud is not free. Docker registries also have strict image size limits and will charge you egress fees based on the size of your images.

What if you could host your own registry, though? And what if, in doing so, you could actually get a better experience than with the hosted registries on the big clouds?


A sea of scattered clouds covers the land beneath. Photo by Xe Iaso, iPhone 15 Pro Max @ 22mm.

· 3 min read
Annie Sexton

Tigris is a globally distributed, S3-compatible object storage solution available on Fly.io. In this article, we'll explore how Tigris fits into the existing slate of object storage options and why you might choose one over another.

You don't need a CDN

Probably the most exciting aspect of Tigris is its globally distributed nature. But what does that actually mean?

First, consider a common setup: you want to quickly deliver assets to users from your object storage, so typically you’d need to make use of a content delivery network (CDN) to cache your data in multiple regions, which helps reduce latency. When using Amazon S3, CloudFront is the CDN most often used.

· 5 min read
Ovais Tariq

Tigris globally distributed object storage [src: playground.com]

Eighteen years ago today, Amazon completely changed how developers work with data storage by giving us Simple Storage Service (S3).

S3 rewrote the rules of storage and propelled us into a new era of cloud computing. Traditional storage solutions were cumbersome and costly, and they shackled developers to the limitations of the hardware. With S3, Amazon introduced a shift towards Storage as a Service, liberating developers from the burdensome tasks of purchasing, provisioning, and managing physical storage. No longer were they bound by the precarious dance of capacity planning, where overestimating meant wasted resources and underestimating spelled disaster for uptime.

· 4 min read
Ovais Tariq

Hello, world! We're Tigris Data, and today we're announcing the public beta of Tigris. Tigris is a globally distributed object storage service that provides low latency anywhere in the world, enabling developers like you to store and access any amount of data using the S3 libraries you're already using in production. Today, we're launching our public beta on top of Fly.io.

Tigris globally distributed object storage [Midjourney prompt: tiger face, illustrated in binary code, blue and white.]