First, be careful when listing objects in an AWS S3 bucket: a bucket can hold millions or even billions of keys, so a naive approach can exhaust your machine's memory or appear to hang. To handle this, boto3 provides a special paginator object that lets you fetch the data in so-called pages (ListObjectsV2 returns at most 1,000 keys per page).
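To get a feel for what "pages" means, here is a toy sketch of the continuation-token loop that the paginator runs for you under the hood. No AWS is involved; the fake_list_objects helper is made up purely for illustration:

```python
# Toy model of S3's ListObjectsV2 pagination: each call returns at most
# max_keys results plus a continuation token if more keys remain.
FAKE_KEYS = [f'file-{i}.txt' for i in range(7)]

def fake_list_objects(start=0, max_keys=3):
    # Made-up stand-in for the real S3 API call
    page = FAKE_KEYS[start:start + max_keys]
    next_token = start + max_keys if start + max_keys < len(FAKE_KEYS) else None
    return {'Contents': page, 'NextContinuationToken': next_token}

def list_all_keys():
    keys, token = [], 0
    while token is not None:  # keep fetching until no token is returned
        resp = fake_list_objects(start=token)
        keys.extend(resp['Contents'])
        token = resp['NextContinuationToken']
    return keys

print(list_all_keys())  # all 7 keys, fetched 3 at a time
```

Each loop iteration only holds one page in flight, which is exactly why the real paginator keeps memory usage flat no matter how big the bucket is.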
Here is Python 3 code that retrieves the names (keys) and sizes of all objects in a specific bucket.
import boto3

def get_list_of_objects(bucket):
    # Use a named profile; replace 'my-profile' with yours,
    # or drop the session to use the default credentials
    session = boto3.session.Session(profile_name='my-profile')
    conn = session.client('s3')
    paginator = conn.get_paginator('list_objects_v2')
    pages = paginator.paginate(Bucket=bucket)
    existing_objects = []
    for page in pages:
        # 'Contents' is absent when the bucket (or a page) is empty
        for obj in page.get('Contents', []):
            existing_objects.append((obj['Key'], obj['Size']))
    return existing_objects

print(get_list_of_objects('my-bucket'))
Of course, before running this code you need to configure AWS credentials, for example with the AWS CLI.
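If you haven't done that yet, a typical way looks like this (assuming the AWS CLI is installed, and that 'my-profile' matches the profile name used in the code above):

```shell
# Prompts for access key, secret key, default region and output format,
# and stores them under the named profile in ~/.aws/credentials
aws configure --profile my-profile
```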
Happy coding!