Access nested dictionary items via a list of keys?

Question:

I have a complex dictionary structure which I would like to access via a list of keys to address the correct item.

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
            },
        "w": 3
        }
    }    

maplist = ["a", "r"]

or

maplist = ["b", "v", "y"]

I have written the following code, which works, but I'm sure there is a better and more efficient way to do this if anyone has an idea.

# Get a given data from a dictionary with position provided as a list
def getFromDict(dataDict, mapList):    
    for k in mapList:
        dataDict = dataDict[k]
    return dataDict

# Set a given data in a dictionary with position provided as a list
def setInDict(dataDict, mapList, value): 
    for k in mapList[:-1]:
        dataDict = dataDict[k]
    dataDict[mapList[-1]] = value

Solution 1:

Use reduce() to traverse the dictionary:

from functools import reduce  # forward compatibility for Python 3
import operator

def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)

and reuse getFromDict to find the location where setInDict() should store the value:

def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

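All but the last element of mapList are needed to find the "parent" dictionary that will receive the value; the last element is then used as the key under which the value is set.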

Demo:

>>> getFromDict(dataDict, ["a", "r"])
1
>>> getFromDict(dataDict, ["b", "v", "y"])
2
>>> setInDict(dataDict, ["b", "v", "w"], 4)
>>> import pprint
>>> pprint.pprint(dataDict)
{'a': {'r': 1, 's': 2, 't': 3},
 'b': {'u': 1, 'v': {'w': 4, 'x': 1, 'y': 2, 'z': 3}, 'w': 3}}

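Note that the Python PEP8 style guide prescribes snake_case names for functions. The above works equally well for lists or a mix of dictionaries and lists, so the names should really be get_by_path() and set_by_path():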

from functools import reduce  # forward compatibility for Python 3
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

And for completeness, a function to delete a key:

def del_by_path(root, items):
    """Delete a key-value in a nested object in root by item sequence."""
    del get_by_path(root, items[:-1])[items[-1]]
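
As a quick check of the claim that these helpers work on a mix of dictionaries and lists, here is a minimal sketch. The `config` data is made up for illustration, and the three functions are repeated from above only so the block runs on its own.

```python
from functools import reduce
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

def del_by_path(root, items):
    """Delete a key-value in a nested object in root by item sequence."""
    del get_by_path(root, items[:-1])[items[-1]]

# Hypothetical data mixing dicts and lists
config = {"servers": [{"host": "alpha", "ports": [80, 443]},
                      {"host": "beta", "ports": [8080]}]}

first_port = get_by_path(config, ["servers", 0, "ports", 1])  # list index, then dict key, then list index
set_by_path(config, ["servers", 1, "host"], "gamma")          # last step is a dict key
set_by_path(config, ["servers", 0, "ports", 0], 8000)         # last step is a list index
del_by_path(config, ["servers", 1, "ports"])                  # removes the "ports" key

# An empty path returns the root object itself
assert get_by_path(config, []) is config
```

Because operator.getitem is just the `[]` operator, any subscriptable container can appear along the path. The flip side is that set_by_path() with an empty path fails with an IndexError, since there is no last element to assign to.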

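Solution 2:

It seems more pythonic to use a for loop. See the quote from What's New In Python 3.0:

Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.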

def nested_get(dic, keys):    
    for key in keys:
        dic = dic[key]
    return dic

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

def nested_del(dic, keys):
    for key in keys[:-1]:
        dic = dic[key]
    del dic[keys[-1]]

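Note that the accepted solution does not set nested keys that do not exist yet (it raises KeyError). The approach above creates the missing nodes instead.

The code works in both Python 2 and 3.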
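
The difference in behaviour for missing parents can be seen in a short sketch. `settings` is a made-up example, and `set_in_dict` is a compact restatement of the reduce-based setter from the accepted answer.

```python
from functools import reduce
import operator

def set_in_dict(data, keys, value):
    # reduce-based setter: every parent key must already exist
    reduce(operator.getitem, keys[:-1], data)[keys[-1]] = value

def nested_set(dic, keys, value):
    # loop-based setter: setdefault creates missing parents on the way down
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

settings = {"a": {"r": 1}}

try:
    set_in_dict(settings, ["b", "v", "y"], 2)
    raised = False
except KeyError:
    raised = True   # "b" does not exist, so the parent lookup fails

nested_set(settings, ["b", "v", "y"], 2)   # creates "b" and "v" as needed
```

One thing to watch with the setdefault version: if an intermediate key already holds a non-dict value such as an int, setdefault returns that value and the following step fails with an AttributeError or TypeError instead of replacing it.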

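Solution 3:

Using reduce is clever, but the OP's set method may have issues if the parent keys do not already exist in the nested dictionary. Since this is the first SO post I saw on this subject in my Google search, I would like to make it slightly better.

The set method in (Setting a value in a nested Python dictionary given a list of indices and value) seems more robust to missing parent keys. To copy it over: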

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

It can also be convenient to have a method that traverses the key tree and returns all the absolute key paths, for which I have created:

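from functools import reduce

def keysInDict(dataDict, parent=()):
    # a non-dict value is a leaf: the accumulated path is complete
    if not isinstance(dataDict, dict):
        return [tuple(parent)]
    else:
        return reduce(list.__add__, 
            [keysInDict(v, tuple(parent) + (k,)) for k,v in dataDict.items()], [])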
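
To show what the path listing looks like, here is a small usage sketch. The function is restated under the snake_case name `keys_in_dict` (a name chosen here) so the block runs on its own, and the sample dictionary is a trimmed version of the one in the question.

```python
from functools import reduce

def keys_in_dict(data, parent=()):
    """Return every root-to-leaf key path as a tuple."""
    if not isinstance(data, dict):
        return [tuple(parent)]
    # Concatenate the path lists produced by each child
    return reduce(list.__add__,
                  [keys_in_dict(v, tuple(parent) + (k,)) for k, v in data.items()],
                  [])

dataDict = {"a": {"r": 1, "s": 2}, "b": {"v": {"y": 2}, "w": 3}}
paths = keys_in_dict(dataDict)
# paths == [('a', 'r'), ('a', 's'), ('b', 'v', 'y'), ('b', 'w')]
```

Note that an empty nested dictionary contributes no path at all, because it has no children and is not treated as a leaf.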

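One use of it is to convert the nested tree to a pandas DataFrame, using the following code (assuming that all leaves in the nested dictionary have the same depth).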

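import numpy as np
import pandas as pd
from functools import reduce

def dict_to_df(dataDict):
    ret = []
    for k in keysInDict(dataDict):
        # each leaf is assumed to be array-like (1-D or 2-D)
        v = np.array( getFromDict(dataDict, k), )
        v = pd.DataFrame(v)
        # wrap each key in a list so from_product receives list-like levels
        v.columns = pd.MultiIndex.from_product([[key] for key in k] + [v.columns])
        ret.append(v)
    return reduce(pd.DataFrame.join, ret)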

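Solution 4:

This library may be helpful: https://github.com/akesterson/dpath-python

A python library for accessing and searching dictionaries via /slashed/paths ala xpath.
Basically it lets you glob over a dictionary as if it were a filesystem.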
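
To make the slash-path idea concrete without depending on the library, here is a minimal sketch that splits a path string and walks the dictionary. `get_by_slash_path` is a hypothetical helper, not part of dpath, and it has none of dpath's globbing or search features.

```python
def get_by_slash_path(data, path, sep="/"):
    """Walk `data` following a path such as '/b/v/y' (hypothetical helper)."""
    # Drop empty segments so a leading or trailing separator is harmless
    keys = [part for part in path.split(sep) if part]
    for key in keys:
        data = data[key]
    return data

dataDict = {"a": {"r": 1}, "b": {"v": {"x": 1, "y": 2}}}
value = get_by_slash_path(dataDict, "/b/v/y")
```

This sketch only handles string keys that do not contain the separator; integer keys or list indices would need extra parsing.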

Solution 5:

How about using recursive functions?

To get a value:

def getFromDict(dataDict, maplist):
    first, rest = maplist[0], maplist[1:]

    if rest: 
        # if `rest` is not empty, run the function recursively
        return getFromDict(dataDict[first], rest)
    else:
        return dataDict[first]

And to set a value:

def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]

    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                # if the key is not a dict, then make it a dict
                dataDict[first] = {}
        except KeyError:
            # if key doesn't exist, create one
            dataDict[first] = {}

        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value

Solution 6:

Solved this with recursion:

def get(d,l):
    if len(l)==1: return d[l[0]]
    return get(d[l[0]],l[1:])

Using your example:

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}
maplist1 = ["a", "r"]
maplist2 = ["b", "v", "y"]
print(get(dataDict, maplist1)) # 1
print(get(dataDict, maplist2)) # 2

Solution 7:

Check out NestedDict from the ndicts package (I am the author); it does exactly what you ask for.

from ndicts import NestedDict

data_dict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}  

nd = NestedDict(data_dict)

You can now access keys using comma-separated values.

>>> nd["a", "r"]
    1
>>> nd["b", "v"]
    {"x": 1, "y": 2, "z": 3}

Solution 8:

You can use pydash:

import pydash as _
_.get(dataDict, ["b", "v", "y"], default='Default')

or

import pydash 
data = {'a': {'b': {'c': [0, 0, {'d': [0, {1: 2}]}]}}}
pydash.get(data, 'a.b.c.2.d.1.[1]')  # ref https://pydash.readthedocs.io/en/latest/deeppath.html#deep-path-strings

https://pydash.readthedocs.io/en/latest/api.html

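Solution 9:

Instead of taking a performance hit each time you look up a value, you can flatten the dictionary once and then simply look up a key such as b:v:y.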

def flatten(mydict,sep = ':'):
  new_dict = {}
  for key,value in mydict.items():
    if isinstance(value,dict):
      _dict = {sep.join([key, _key]):_value for _key, _value in flatten(value, sep).items()}
      new_dict.update(_dict)
    else:
      new_dict[key]=value
  return new_dict

dataDict = {
"a":{
    "r": 1,
    "s": 2,
    "t": 3
    },
"b":{
    "u": 1,
    "v": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "w": 3
    }
}    

flat_dict = flatten(dataDict)
print(flat_dict)
{'a:r': 1, 'a:s': 2, 'a:t': 3, 'b:u': 1, 'b:v:x': 1, 'b:v:y': 2, 'b:v:z': 3, 'b:w': 3}

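This way you can simply look up items using flat_dict['b:v:y'], which will give you 2.

And instead of traversing the dictionary on each lookup, you can speed this up by flattening the dictionary once and saving the output, so that a lookup from a cold start means loading the flattened dictionary and performing a plain key/value lookup with no traversal.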
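
One limitation of joining keys into a string is that it assumes string keys that never contain the separator. A possible variant, sketched below, uses tuples of keys as the flat keys instead. `flatten_paths` is a name chosen here for illustration, and the sample data (including the integer key) is invented.

```python
def flatten_paths(nested, prefix=()):
    """Map each root-to-leaf key path (a tuple) to its leaf value."""
    flat = {}
    for key, value in nested.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_paths(value, path))
        else:
            flat[path] = value
    return flat

dataDict = {"a": {"r": 1}, "b": {"u": 1, "v": {"x": 1, "y": 2}, 7: "int key"}}
flat = flatten_paths(dataDict)

maplist = ["b", "v", "y"]
value = flat[tuple(maplist)]   # a key list converts directly into a lookup key
```

Either way, the flat dictionary is a snapshot: later changes to the nested dictionary are not reflected until it is flattened again.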

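Solution 10:

It is satisfying to see these answers with two static methods for setting and getting nested attributes. These solutions are way better than using nested trees https://gist.github.com/hrldcpr/2012250

Here's my implementation.

Usage:

To set a nested attribute, call sattr(my_dict, 1, 2, 3, 5), which is equivalent to my_dict[1][2][3]=5

To get a nested attribute, call gattr(my_dict, 1, 2)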

def gattr(d, *attrs):
    """
    This method receives a dict and list of attributes to return the innermost value of the given dict
    """
    try:
        for at in attrs:
            d = d[at]
        return d
    except(KeyError, TypeError):
        return None


def sattr(d, *attrs):
    """
    Adds "val" to dict in the hierarchy mentioned via *attrs
    For ex:
    sattr(animals, "cat", "leg","fingers", 4) is equivalent to animals["cat"]["leg"]["fingers"]=4
    This method creates necessary objects until it reaches the final depth
    This behaviour is also known as autovivification and plenty of implementation are around
    This implementation addresses the corner case of replacing existing primitives
    https://gist.github.com/hrldcpr/2012250#gistcomment-1779319
    """
    for attr in attrs[:-2]:
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]

Solution 11:

Pure Python style, without any imports:

def nested_set(element, value, *keys):
    if type(element) is not dict:
        raise AttributeError('nested_set() expects dict as first argument.')
    if len(keys) < 2:
        raise AttributeError('nested_set() expects at least three arguments, not enough given.')

    _keys = keys[:-1]
    _element = element
    for key in _keys:
        _element = _element[key]
    _element[keys[-1]] = value

example = {"foo": { "bar": { "baz": "ok" } } }
keys = ['foo', 'bar']
nested_set(example, "yay", *keys)
print(example)

Output

{'foo': {'bar': 'yay'}}

Solution 12:

An alternative way, if you don't want an error raised when one of the keys is absent (so that your main code can run without interruption):

def get_value(your_dict, *keys):
    curr = your_dict
    for k in keys:
        # a non-dict value cannot be descended into, so the path is missing
        if not isinstance(curr, dict):
            return None
        curr = curr.get(k)
        if curr is None:
            return None
    return curr

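In this case, if any of the input keys is not present, None is returned, which can be used as a check in your main code to perform an alternative task.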
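
Returning None on a miss cannot distinguish a missing key from a key whose stored value is None. If that matters, one option is a private sentinel object, sketched here. `_MISSING` and `get_value_or` are hypothetical names, and `record` is invented sample data.

```python
_MISSING = object()   # unique marker that can never appear in user data

def get_value_or(data, keys, default=_MISSING):
    """Follow `keys` through nested dicts; raise KeyError or return `default` on a miss."""
    current = data
    for key in keys:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif default is _MISSING:
            raise KeyError(key)   # no default supplied: behave like plain indexing
        else:
            return default
    return current

record = {"user": {"nickname": None}}
found = get_value_or(record, ["user", "nickname"], default="n/a")   # key exists, value is None
absent = get_value_or(record, ["user", "email"], default="n/a")     # key is missing
```

Here `found` is None because the key exists, while `absent` is the supplied default, so the caller can tell the two situations apart.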

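Solution 13:

Very late to the party, but posting in case this helps someone in the future. For my use case, the following function worked best. It works for pulling any data type out of a dictionary.

dict is the dictionary containing our value

list is a list of "steps" towards our value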

def getnestedvalue(dict, list):

    length = len(list)
    try:
        for depth, key in enumerate(list):
            if depth == length - 1:
                output = dict[key]
                return output
            dict = dict[key]
    except (KeyError, TypeError):
        return None

    return None

Solution 14:

I'd rather use a simple recursive function:

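def get_value_by_path(data, maplist):
    if not maplist:
        return data
    key = maplist[0]
    # follow only the first key; a missing key (or a non-dict value) yields None
    if isinstance(data, dict) and key in data:
        return get_value_by_path(data[key], maplist[1:])
    return None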

Solution 15:

A multi-purpose and simple function to get a field value from nested dicts or lists:

def key_chain(data, *args, default=None):
    for key in args:
        if isinstance(data, dict):
            data = data.get(key, default)
        elif isinstance(data, (list, tuple)) and isinstance(key, int):
            try:
                data = data[key]
            except IndexError:
                return default
        else:
            return default
    return data

It returns the default value if any key is missing, and supports integer keys for lists and tuples. In your case you can call it like this:

key_chain(dataDict, *maplist)

or

key_chain(dataDict, "b", "v", "y")

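Solution 16:

Correct me if I'm wrong, but none of the (many) answers here handle the case where you want a default value returned when a key is not found. In addition, this function handles the case where you try to search deeper than the dictionary goes.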

def deep_get(d, keys, default=None):
    if keys:
        if isinstance(d, dict):
            return deep_get(d.get(keys[0], default), keys[1:], default)
        else:
            return default
    else:
        return d

# Tests
d = {'A': 1, 'B': {'a': 5, 'b': 6}}
assert deep_get(d, ['A']) == 1
assert deep_get(d, ['B', 'b']) == 6
assert deep_get(d, ['C']) is None
assert deep_get(d, ['C'], -1) == -1
assert deep_get(d, ['A', 'b'], -1) == -1
assert deep_get(d, ['B', 'a', 'b'], -1) == -1
assert deep_get({}, ['A'], -1) == -1
assert deep_get(None, ['A'], -1) == -1
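
A subtle point in the recursive version: when an intermediate key is missing, the default itself becomes the object searched with the remaining keys, so a dict-valued default can end up being searched by accident. An iterative sketch that returns the default as soon as a step fails avoids this. `deep_get_iter` and `fallback` are names introduced here for illustration.

```python
def deep_get_iter(d, keys, default=None):
    """Iterative lookup that stops at the first missing step."""
    for key in keys:
        if not isinstance(d, dict) or key not in d:
            return default
        d = d[key]
    return d

d = {'A': 1, 'B': {'a': 5, 'b': 6}}
fallback = {'b': 'from default'}

# 'C' is missing, so the default is returned untouched
result = deep_get_iter(d, ['C', 'b'], fallback)
```

For the same call, deep_get(d, ['C', 'b'], fallback) from above returns 'from default', because the lookup of 'b' is applied to the default dictionary.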

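Solution 17:

If you also want to be able to work with arbitrary JSON, including nested lists and dicts, and to handle invalid lookup paths gracefully, here is my solution: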

from functools import reduce


def get_furthest(s, path):
    '''
    Gets the furthest value along a given key path in a subscriptable structure.

    subscriptable, list -> any
    :param s: the subscriptable structure to examine
    :param path: the lookup path to follow
    :return: a tuple of the value at the furthest valid key, and whether the full path is valid
    '''

    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)

    return reduce(step_key, path, (s, True))


def get_val(s, path):
    val, successful = get_furthest(s, path)
    if successful:
        return val
    else:
        raise LookupError('Invalid lookup path: {}'.format(path))


def set_val(s, path, value):
    get_val(s, path[:-1])[path[-1]] = value

Solution 18:

How do you check and then set a dict element without processing all the indexes twice?

Solution:

def nested_yield(nested, keys_list):
    """
    Get current nested data by send(None) method. Allows change it to Value by calling send(Value) next time
    :param nested: list or dict of lists or dicts
    :param keys_list: list of indexes/keys
    """
    if not len(keys_list):  # assign to 1st level list
        if isinstance(nested, list):
            while True:
                nested[:] = yield nested
        else:
            raise IndexError('Only lists can take element without key')


    last_key = keys_list.pop()
    for key in keys_list:
        nested = nested[key]

    while True:
        try:
            nested[last_key] = yield nested[last_key]
        except IndexError as e:
            print('no index {} in {}'.format(last_key, nested))
            yield None

Workflow example:

ny = nested_yield(nested_dict, nested_address)
data_element = ny.send(None)
if data_element:
    # process element
    ...
else:
    # extend/update nested data
    ny.send(new_data_element)
    ...
ny.close()

Test

>>> cfg= {'Options': [[1,[0]],[2,[4,[8,16]]],[3,[9]]]}
>>> ny = nested_yield(cfg, ['Options',1,1,1])
>>> ny.send(None)
[8, 16]
>>> ny.send('Hello!')
'Hello!'
>>> cfg
{'Options': [[1, [0]], [2, [4, 'Hello!']], [3, [9]]]}
>>> ny.close()

Solution 19:

A method that concatenates strings:

def get_sub_object_from_path(dict_name, map_list):
    for i in map_list:
        _string = "['%s']" % i
        dict_name += _string
    value = eval(dict_name)
    return value
#Sample:
_dict = {'new': 'person', 'time': {'for': 'one'}}
map_list = ['time', 'for']
print(get_sub_object_from_path("_dict",map_list))
#Output:
#one

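Solution 20:

Extending the approach of @DomTomCat and others, this functional setter and mapper (that is, they return modified data via deepcopy without affecting the input) work for nested dict and list.

Setter: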

from copy import deepcopy

def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys)>1:
        # recurse into the matching child only; other children are kept as copied
        if isinstance(data,dict):
            return {k:(set_at_path(v,keys[1:],value) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [set_at_path(x[1],keys[1:],value) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=value
        return data

Mapper:

def map_at_path(data0, keys, f):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(map_at_path(v,keys[1:],f) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [map_at_path(x[1],keys[1:],f) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=f(data[keys[-1]])
        return data
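
The same non-mutating behaviour can also be obtained with a single deepcopy at the top followed by an in-place set on the copy. This is a simplified sketch of the idea rather than the code above; `set_at_path_copy` is a hypothetical name and `original` is invented sample data.

```python
from copy import deepcopy

def set_at_path_copy(data, keys, value):
    """Return a modified deep copy of `data`; the input is left untouched."""
    result = deepcopy(data)
    target = result
    for key in keys[:-1]:
        target = target[key]      # works for dict keys and list indices alike
    target[keys[-1]] = value
    return result

original = {"b": {"v": [10, {"y": 2}]}}
updated = set_at_path_copy(original, ["b", "v", 1, "y"], 99)
```

Copying once at the top avoids re-copying each subtree at every recursion level, at the cost of raising KeyError or IndexError when a step along the path does not exist.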

Solution 21:

I use this:

import logging

def get_dictionary_value(dictionary_temp, variable_dictionary_keys):
     try:
          if(len(variable_dictionary_keys) == 0):
               return str(dictionary_temp)

          # slice instead of remove() so the caller's key list is not modified
          variable_dictionary_key = variable_dictionary_keys[0]

          return get_dictionary_value(dictionary_temp[variable_dictionary_key] , variable_dictionary_keys[1:])

     except Exception as variable_exception:
          logging.error(variable_exception)

          return ''

Solution 22:

You can use the eval function in Python.

def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__':None}, {'nest':nest})

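Explanation

For your example query: maplist = ["b", "v", "y"]

nestq will be "nest['b']['v']['y']", where nest is the nested dictionary.

The eval builtin function executes the given string. However, it is important to be careful about possible vulnerabilities that arise from using eval. Discussion can be found here:

In the nested_parse() function, I have made sure that no __builtins__ globals are available and that the only local variable available is the nest dictionary.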
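
Because the expression is built by pasting keys into source text, a key that contains a quote character changes the expression itself. The sketch below shows that failure next to a plain loop that treats keys purely as data. The `tricky` dictionary and the `nested_lookup` name are invented for the demonstration.

```python
def nested_parse(nest, map_list):
    # eval-based lookup, as in the answer above
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__': None}, {'nest': nest})

def nested_lookup(nest, map_list):
    # plain loop: keys are never interpreted as code
    for key in map_list:
        nest = nest[key]
    return nest

tricky = {"it's": {"ok": 1}}

try:
    nested_parse(tricky, ["it's", "ok"])
    eval_failed = False
except SyntaxError:
    eval_failed = True   # the apostrophe ends the string literal early

safe_value = nested_lookup(tricky, ["it's", "ok"])
```

The eval version is also limited to string keys, since str.join() rejects integers, whereas the loop accepts any hashable key or list index.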