[python] sorting nested dictionary, json serializing, subscriptable, boto3

JunePyo Suh · October 25, 2020

Sort nested dictionary by value, and remainder by another value

Consider the following dictionary format.

{'KEY1':{'name':'google','date':20100701,'downloads':0},
 'KEY2':{'name':'chrome','date':20071010,'downloads':0},
 'KEY3':{'name':'python','date':20100710,'downloads':100}}

To sort by downloads first and break ties (for example, among all items with no downloads) by date, use the key argument of sorted(). The key argument takes a function that returns the value each item should be sorted by. If that function returns a tuple, items are sorted by the tuple's first element, and ties are broken by the second.

# iterating over a dict yields its keys, so this returns the keys in sorted order
sorted(your_dict, key=lambda k: (your_dict[k]['downloads'], your_dict[k]['date']))

Instead of using a lambda, we can also pass a separately defined function.

def keyfunc(item):
    # item is a (key, inner_dict) pair produced by your_dict.items()
    _, inner = item
    return inner["downloads"], inner["date"]

items = sorted(your_dict.items(), key=keyfunc)
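To see the tuple key in action, here is a minimal sketch that runs both approaches on the sample dictionary from above. The variable names (your_dict, by_downloads_then_date, most_downloaded_first) are only illustrative; the second sort negates downloads to put the most downloaded items first, which is one common trick for numeric values rather than the only way to do it.

```python
your_dict = {
    'KEY1': {'name': 'google', 'date': 20100701, 'downloads': 0},
    'KEY2': {'name': 'chrome', 'date': 20071010, 'downloads': 0},
    'KEY3': {'name': 'python', 'date': 20100710, 'downloads': 100},
}

# Ascending by downloads; ties are broken by ascending date.
by_downloads_then_date = sorted(
    your_dict,
    key=lambda k: (your_dict[k]['downloads'], your_dict[k]['date']),
)
print(by_downloads_then_date)  # ['KEY2', 'KEY1', 'KEY3']

# Descending by downloads (negated), ties still broken by ascending date.
most_downloaded_first = sorted(
    your_dict,
    key=lambda k: (-your_dict[k]['downloads'], your_dict[k]['date']),
)
print(most_downloaded_first)  # ['KEY3', 'KEY2', 'KEY1']
```

KEY1 and KEY2 both have zero downloads, so their relative order is decided by date alone.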

TypeError when serializing class instance to JSON

Serializing a class instance with json.dumps() raises a TypeError because the default JSON encoder only knows how to serialize a limited set of built-in types: dict, list, tuple, str, int, float, bool, and None.

One solution is to write a class that inherits from json.JSONEncoder and overrides its default() method.
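A minimal sketch of the encoder approach might look like the following. FooEncoder is a hypothetical name chosen for illustration, and falling back to the instance's __dict__ is just one possible choice of what to emit; json.JSONEncoder, its default() method, and the cls argument of json.dumps() are part of the standard library.

```python
import json


class Foo:
    def __init__(self):
        self.x = 1
        self.y = 2


class FooEncoder(json.JSONEncoder):
    """Hypothetical encoder that knows how to serialize Foo instances."""

    def default(self, o):
        # default() is only called for objects the base encoder cannot handle.
        if isinstance(o, Foo):
            return o.__dict__
        # Let the base class raise TypeError for anything else.
        return super().default(o)


encoded = json.dumps(Foo(), cls=FooEncoder)
print(encoded)  # {"x": 1, "y": 2}
```

The advantage over converting by hand is that the encoder also applies to Foo instances nested inside lists or dictionaries.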

Another simple solution is to call json.dumps() on the instance's __dict__ attribute.

import json

class Foo:
    def __init__(self):
        self.x = 1
        self.y = 2

foo = Foo()
# json.dumps(foo) would raise TypeError: Object of type Foo is not JSON serializable

s = json.dumps(foo.__dict__)  # s is now the string '{"x": 1, "y": 2}'

"Subscriptable"

Subscriptable means that the object implements the __getitem__() method, so it can be indexed with square brackets, as in obj[key]. In other words, it describes objects that are "containers." Subscriptable types include strings, lists, tuples, and dictionaries.
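As a rough illustration, the sketch below defines a hypothetical Squares class that becomes subscriptable simply by implementing __getitem__(), and then shows the error Python raises when square brackets are used on an object that does not, such as an int.

```python
class Squares:
    """Hypothetical class: subscriptable because it defines __getitem__."""

    def __getitem__(self, index):
        return index * index


squares = Squares()
value = squares[4]  # equivalent to squares.__getitem__(4)
print(value)  # 16

number = 5
try:
    number[0]  # int does not implement __getitem__
except TypeError as exc:
    message = str(exc)
    print(message)  # 'int' object is not subscriptable
```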

Boto3

Adapted from this Medium article.

Reading from S3 using Boto

import boto3
import csv

# get a handle on S3
session = boto3.Session(
    aws_access_key_id='XXXXXXXXXXXXXXXXXXXXXXX',
    aws_secret_access_key='XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',
    region_name='XXXXXXXXXX')

s3 = session.resource('s3')

# get a handle on the bucket that holds your file
bucket = s3.Bucket('bucket name') # example: energy_market_procesing

# get a handle on the object you want (i.e. your file)
obj = bucket.Object(key='file to read') # example: market/zone1/data.csv

# get the object
response = obj.get()

# read the contents of the file as bytes
lines = response['Body'].read()

# save the file data in a new file, test.csv
with open('test.csv', 'wb') as file:
    file.write(lines)
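If the goal is to work with the rows rather than save the file, the bytes returned by read() can be parsed in memory with the csv module. The sketch below assumes the object is a UTF-8 encoded CSV with a header row; the raw bytes are made-up sample data standing in for response['Body'].read(), so the snippet runs without AWS access.

```python
import csv
import io

# Stand-in for: raw = response['Body'].read()
# (made-up sample content, assumed to be UTF-8 CSV with a header row)
raw = b"zone,price\nzone1,42.5\nzone2,39.0\n"

# Decode the bytes and let csv.DictReader map each row to the header names.
reader = csv.DictReader(io.StringIO(raw.decode("utf-8")))
rows = list(reader)

for row in rows:
    print(row["zone"], row["price"])
```

Note that csv returns every field as a string, so numeric columns such as price need an explicit conversion like float(row["price"]).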
