Nested dictionary value from a key path
- 2025-03-25 08:47:00
- admin (original post)
Problem description:
I want to get a value from a nested dictionary by means of a key path. Here is the dict:
json = {
    "app": {
        "Garden": {
            "Flowers": {
                "Red flower": "Rose",
                "White Flower": "Jasmine",
                "Yellow Flower": "Marigold"
            }
        },
        "Fruits": {
            "Yellow fruit": "Mango",
            "Green fruit": "Guava",
            "White Flower": "groovy"
        },
        "Trees": {
            "label": {
                "Yellow fruit": "Pumpkin",
                "White Flower": "Bogan"
            }
        }
    }
}
The input parameter of the method is a dot-separated key path. For the key path "app.Garden.Flowers.White Flower" it should print "Jasmine". My code so far:
import json

with open('data.json') as data_file:
    j = json.load(data_file)


def find(element, JSON):
    paths = element.split(".")
    # print JSON[paths[0]][paths[1]][paths[2]][paths[3]]
    for i in range(0, len(paths)):
        data = JSON[paths[i]]
        # data = data[paths[i+1]]
    print(data)


find('app.Garden.Flowers.White Flower', j)
Solution 1:
This is an instance of a fold. You can write it concisely like this:
from functools import reduce
import operator


def find(element, json):
    return reduce(operator.getitem, element.split('.'), json)
Or, more Pythonically (reduce() is often frowned upon for poor readability), like this:
def find(element, json):
    keys = element.split('.')
    rv = json
    for key in keys:
        rv = rv[key]
    return rv


j = {"app": {
    "Garden": {
        "Flowers": {
            "Red flower": "Rose",
            "White Flower": "Jasmine",
            "Yellow Flower": "Marigold"
        }
    },
    "Fruits": {
        "Yellow fruit": "Mango",
        "Green fruit": "Guava",
        "White Flower": "groovy"
    },
    "Trees": {
        "label": {
            "Yellow fruit": "Pumpkin",
            "White Flower": "Bogan"
        }
    }
}}
print(find('app.Garden.Flowers.White Flower', j))
Solution 2:
I suggest using python-benedict, a Python dict subclass with full keypath support and many utility methods.
You just need to cast your existing dict:
from benedict import benedict

d = benedict(json)
# now your keys support dotted keypaths
print(d['app.Garden.Flowers.White Flower'])
Here are the library and its documentation:
https://github.com/fabiocaccamo/python-benedict
Note: I am the author of this project.
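Solution 3:
Option 1: Cisco's pyats library (it is a C extension)
It is very fast (measure it with timeit if needed).
JavaScript-style usage: bracket lookup, dotted lookup, or a combination of both.
A dotted lookup of a missing key raises AttributeError; a bracket lookup, like a default Python dict lookup, raises KeyError.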
pip install pyats pyats-datastructures pyats-utils

from pyats.datastructures import NestedAttrDict

item = {"specifications": {"os": {"value": "Android"}}}
path = "specifications.os.value"
x = NestedAttrDict(item)
print(x[path])  # prints Android
print(x['specifications'].os.value)  # prints Android
print(x['specifications']['os']['value'])  # prints Android
print(x['specifications'].os.value1)  # raises AttributeError
Option 2: pyats.utils chainget
Very fast (measure it with timeit if needed).
from pyats.utils import utils

item = {"specifications": {"os": {"value": "Android"}}}
path = "specifications.os.value"
path1 = "specifications.os.value1"
print(utils.chainget(item, path))  # prints Android (string version)
print(utils.chainget(item, path.split('.')))  # prints Android (list version)
print(utils.chainget(item, path1))  # raises KeyError
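Option 3: plain Python, no external library
Faster than the lambda version.
A missing final key returns None instead of raising, so that case needs no separate error handling, unlike the lambda and other variants.
Readable and concise; it can serve as a utility function/helper in a project.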
from functools import reduce

item = {"specifications": {"os": {"value": "Android"}}}
path1 = "specifications.os.value"
path2 = "specifications.os.value1"


def test1():
    print(reduce(dict.get, path1.split('.'), item))


def test2():
    print(reduce(dict.get, path2.split('.'), item))


test1()  # prints Android
test2()  # prints None
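One caveat about option 3: reduce(dict.get, ...) returns None only when the last key is the missing one. If an intermediate key is missing, the next step calls dict.get on None and raises TypeError. A minimal sketch of a variant that tolerates this, assuming a helper named safe_get (a hypothetical name, not part of any library):

```python
from functools import reduce


def safe_get(data, path, sep="."):
    """Return the value at a separated key path, or None if any key is missing.

    safe_get is a hypothetical helper name used only for illustration.
    """
    def step(acc, key):
        # Keep descending only while we still hold a dict; otherwise propagate None.
        return acc.get(key) if isinstance(acc, dict) else None

    return reduce(step, path.split(sep), data)


item = {"specifications": {"os": {"value": "Android"}}}
print(safe_get(item, "specifications.os.value"))      # Android
print(safe_get(item, "specifications.family.value"))  # None
```

The isinstance check is the only difference from the dict.get version; it costs a little speed in exchange for never raising on a missing branch.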
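Solution 4:
Your code depends heavily on dots never occurring in key names. You may be able to control that, but not necessarily.
I would go for a generic solution that takes a list of element names, and then generate that list by, for example, splitting a dotted string of key names: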
class ExtendedDict(dict):
    """changes a normal dict into one where you can hand a list
    as first argument to .get() and it will do a recursive lookup
    result = x.get(['a', 'b', 'c'], default_val)
    """
    def multi_level_get(self, key, default=None):
        if not isinstance(key, list):
            # call dict.get directly: self.get is rebound to this method below
            return dict.get(self, key, default)
        # assume that the key is a list of recursively accessible dicts
        def get_one_level(key_list, level, d):
            if level >= len(key_list):
                if level > len(key_list):
                    raise IndexError
                return d[key_list[level-1]]
            return get_one_level(key_list, level+1, d[key_list[level-1]])

        try:
            return get_one_level(key, 1, self)
        except KeyError:
            return default

    get = multi_level_get  # if you delete this, you can still use multi_level_get
Once you have this class, it is easy to convert your dict and get "Jasmine":
json = {
    "app": {
        "Garden": {
            "Flowers": {
                "Red flower": "Rose",
                "White Flower": "Jasmine",
                "Yellow Flower": "Marigold"
            }
        },
        "Fruits": {
            "Yellow fruit": "Mango",
            "Green fruit": "Guava",
            "White Flower": "groovy"
        },
        "Trees": {
            "label": {
                "Yellow fruit": "Pumpkin",
                "White Flower": "Bogan"
            }
        }
    }
}
j = ExtendedDict(json)
print(j.get('app.Garden.Flowers.White Flower'.split('.')))
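which gives you:
Jasmine
As with a normal dict's get(), you get None if the key (list) you specify does not exist anywhere in the tree, and you can pass a second argument to be returned instead of None.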
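To illustrate the point about dots in key names: splitting on "." cannot address a key such as "v1.2", whereas passing the path as an explicit list of keys can. A minimal sketch with made-up data and a hypothetical helper get_path (neither comes from the question):

```python
def get_path(data, keys, default=None):
    """Walk nested dicts using an explicit list of keys (hypothetical helper)."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, TypeError):
            # Missing key, or the current value cannot be indexed by this key.
            return default
    return current


# Made-up example data: one key contains a dot.
config = {"releases": {"v1.2": {"status": "stable"}}}

# Splitting the dotted string breaks "v1.2" into "v1" and "2".
print(get_path(config, "releases.v1.2.status".split(".")))  # None
# An explicit key list is unambiguous.
print(get_path(config, ["releases", "v1.2", "status"]))     # stable
```

Accepting a list as the primary interface keeps the dotted string as a convenience layer that callers can bypass when their keys are not dot-safe.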
Solution 5:
A one-liner:
from functools import reduce
a = {"foo" : { "bar" : "blah" }}
path = "foo.bar"
reduce(lambda acc,i: acc[i], path.split('.'), a)
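Solution 6:
You were very close. You need to (as in your commented-out line) walk down through the main JSON object step by step. You can do this by storing the result of the outermost key lookup, then using that result to look up the next key, and so on until the path is exhausted.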
def find(element, JSON):
    paths = element.split(".")
    data = JSON
    for i in range(0, len(paths)):
        data = data[paths[i]]
    print(data)
You still need to watch out for KeyError, though.
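One way to handle the KeyError concern is to check each key before descending and report how far the lookup got. The following is only a sketch under that assumption; the shortened sample data and the choice to put the failing prefix in the exception are illustrative, not from the original answer:

```python
def find(element, data):
    """Return the value at a dotted key path.

    Raises KeyError carrying the dotted prefix at which the lookup failed.
    """
    current = data
    walked = []
    for key in element.split("."):
        walked.append(key)
        if not isinstance(current, dict) or key not in current:
            raise KeyError(".".join(walked))
        current = current[key]
    return current


# Shortened version of the question's data, for illustration only.
j = {"app": {"Garden": {"Flowers": {"White Flower": "Jasmine"}}}}

print(find("app.Garden.Flowers.White Flower", j))  # Jasmine
try:
    find("app.Garden.Flower.White Flower", j)
except KeyError as exc:
    print("missing:", exc)  # missing: 'app.Garden.Flower'
```

Returning the value instead of printing it lets the caller decide whether a missing path is an error or should fall back to a default.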
Solution 7:
I wrote a function that also handles lists nested inside the dictionary.
d = {'test': [
    {'value1': 'val'},
    {'value1': 'val2'}]}


def find_element(keys: list, dictionary: dict):
    rv = dictionary
    if not keys:
        # path exhausted: return what was reached, even if it is a dict or list
        return rv
    if isinstance(dictionary, dict):
        rv = find_element(keys[1:], rv[keys[0]])
    elif isinstance(dictionary, list):
        if keys[0].isnumeric():
            rv = find_element(keys[1:], dictionary[int(keys[0])])
    else:
        return rv
    return rv


val = find_element('test.1.value1'.split('.'), d)
print(val)  # val2
Solution 8:
Adding an approach that uses jsonpath-ng, in case someone finds it better suited to their needs:
from jsonpath_ng import parse
parse('app.Garden.Flowers."White Flower"').find(json)[0].value
'Jasmine'
There are many examples of using jsonpath-ng to query JSON with JSONPath syntax.
Solution 9:
Data:
data = {
    "data": {
        "author_id": "1",
        "text": "hi msg",
        "attachments": {
            "media_keys": [
                "3_16"
            ]
        },
        "id": "2",
        "edit_history_tweet_ids": [
            "2"
        ]
    },
    "includes": {
        "media": [
            {
                "media_key": "3_16",
                "height": 500,
                "type": "photo",
                "width": 500,
                "url": "https://pbs.twimg.com/media/xxxxxx.png"
            }
        ],
        "users": [
            {
                "id": "1",
                "name": "name1",
                "username": "username1"
            }
        ]
    }
}
Function:
def get_value_from_dict(dic_obj, keys: list, default):
    """
    get value from dict with key path.
    :param dic_obj: dict
    :param keys: list of keys (list indexes given as digit strings)
    :param default: default value
    :return: the value found, or default if the path does not exist
    """
    if not dic_obj or not keys:
        return default
    pre_obj = dic_obj
    for key in keys:
        t = type(pre_obj)
        if t is dict:
            if key not in pre_obj:
                # a missing key yields default, also when it is the last key
                return default
            pre_obj = pre_obj[key]
        elif (t is list or t is tuple) and str(key).isdigit() and len(pre_obj) > int(key):
            pre_obj = pre_obj[int(key)]
        else:
            return default
    return pre_obj
Test:
print('media_key:', get_value_from_dict(data, 'data.attachments.media_keys'.split('.'), None))
print('username:', get_value_from_dict(data, 'includes.users.0.username'.split('.'), None))
media_key: ['3_16']
username: username1