Old bundles not being deleted from AWS #21

Open
ssvmvss opened this issue Jun 13, 2019 · 2 comments

@ssvmvss
ssvmvss commented Jun 13, 2019

Purpose

Bundled files uploaded by this tool are never removed from AWS, so they end up taking a lot of space. Could we add some kind of worker that takes care of removing the ones that get old?

@djfarrelly
Contributor

Hey @ssvmvss, I love the idea of cleaning up unneeded files in S3! The hard part is that I'm not exactly sure how we should implement this. There are a few cases this would have to handle:

  • During a deploy, we'd still need the last version of an asset until the new one is fully deployed
  • If we roll back our code to a previous version, we'd expect the static assets for that version to still be available
  • We have multiple dev branches with their own bundle versions; all of them need to work and could be in use for days or weeks depending on how long the branch stays open

We might be able to develop a new versioning or hashing URL structure to get around this and make it easier to delete things. Here are some ideas:

  • Change the file path to contain the release git hash and re-upload all files on every single deploy: static.buffer.com/<project>/<git-hash>/path/to/file.js
    • ex. static.buffer.com/analyze/8b2836d/js/bundle.js
    • This could mean longer deploys for repos that have lots of images/files
  • Add a "cleanup" function to this cli that allows someone to remove all but the last X (ex. 5) versions of a file being uploaded (see the sketch after this list)
    • We'd have to determine the rules this system would run on, then implement some code that scans an entire directory in the S3 bucket and intelligently deletes files
    • If we did this, the last 5 versions might all be dev versions, so we'd have to implement some logic that allows us to track which files were uploaded for a production release vs. a development release
    • We could use S3 object tagging to tag all files with Environment=production and/or the Git release when the file was uploaded
  • We could log all file access on this bucket, then write a script that deletes files that have not been accessed in 90 days. This could work, but it would likely be a separate project that we could ask the Infra team to take on!
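
To make the cleanup idea a bit more concrete, here's a rough sketch of what that command could look like, assuming we switched to the static.buffer.com/<project>/<git-hash>/path/to/file.js layout above. It uses the AWS SDK for JavaScript v3 from TypeScript; the bucket name, region, and keep count are placeholders, not anything we've decided on.

```ts
import {
  S3Client,
  ListObjectsV2Command,
  DeleteObjectsCommand,
  type _Object,
} from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // region is a placeholder

// List every object under a prefix, following pagination.
async function listAll(bucket: string, prefix: string): Promise<_Object[]> {
  const objects: _Object[] = [];
  let token: string | undefined;
  do {
    const page = await s3.send(
      new ListObjectsV2Command({ Bucket: bucket, Prefix: prefix, ContinuationToken: token })
    );
    objects.push(...(page.Contents ?? []));
    token = page.NextContinuationToken;
  } while (token);
  return objects;
}

// Keep the newest `keep` releases (e.g. "analyze/8b2836d/...") and delete the rest.
async function cleanup(bucket: string, project: string, keep = 5): Promise<void> {
  const objects = await listAll(bucket, `${project}/`);

  // Group objects by the <git-hash> segment of <project>/<git-hash>/path/to/file.js.
  const releases = new Map<string, _Object[]>();
  for (const obj of objects) {
    const hash = obj.Key?.split("/")[1];
    if (!hash) continue;
    releases.set(hash, [...(releases.get(hash) ?? []), obj]);
  }

  // Sort releases newest-first by the most recent object they contain.
  const newest = (objs: _Object[]) =>
    Math.max(...objs.map((o) => o.LastModified?.getTime() ?? 0));
  const sorted = [...releases.values()].sort((a, b) => newest(b) - newest(a));

  // Everything past the first `keep` releases is a delete candidate.
  const stale = sorted.slice(keep).flat();

  // DeleteObjects accepts at most 1000 keys per request.
  for (let i = 0; i < stale.length; i += 1000) {
    await s3.send(
      new DeleteObjectsCommand({
        Bucket: bucket,
        Delete: { Objects: stale.slice(i, i + 1000).map((o) => ({ Key: o.Key! })) },
      })
    );
  }
}

cleanup("static.buffer.com", "analyze").catch(console.error);
```

If we also went with the Environment=production object tags, the stale list could additionally be filtered with GetObjectTaggingCommand before deleting, at the cost of one extra API call per object.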

As for short-term actions, I've just added a Name: static.buffer.com tag to that S3 bucket to help track our usage in that bucket and see how much the storage is costing us! 😄
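
For reference, assuming the bucket is literally named static.buffer.com, that tag can also be applied from the same SDK. Note that PutBucketTagging replaces the bucket's entire tag set, so any existing tags have to be included in the call.

```ts
import { S3Client, PutBucketTaggingCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // region is a placeholder

// PutBucketTagging replaces the bucket's whole tag set, so include any existing tags too.
async function tagBucket(): Promise<void> {
  await s3.send(
    new PutBucketTaggingCommand({
      Bucket: "static.buffer.com", // assumed bucket name
      Tagging: { TagSet: [{ Key: "Name", Value: "static.buffer.com" }] },
    })
  );
}

tagBucket().catch(console.error);
```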

@ssvmvss, do you have any ideas on how we could implement what you're talking about in a simpler way? Any ideas from your previous experience doing things like this?

@djfarrelly
Contributor

Cross-referencing this thread:

The static.buffer.com bucket with our bundles only cost $83 in August, compared to our total S3 cost of $21k that month.

While we should clean things up, this isn't a huge cost, so it might not be worth the time right now.
