将常规 Python 字符串转换为原始字符串-IT科技

将常规 Python 字符串转换为原始字符串

2025-02-13 08:35:00

admin

原创

摘要：问题描述：我有一个字符串s，其内容是可变的。如何将其变为原始字符串？我正在寻找与该r''方法类似的方法。解决方案 1：原始字符串不是另一种类型的字符串。它们是在源代码中描述字符串的另一种方式。一旦创建了字符串，它就是原来的字符串。解决方案 2：我相信您正在寻找的是 str.encode("string-esc...

问题描述：

我有一个字符串s，其内容是可变的。如何将其变为原始字符串？我正在寻找与该r''方法类似的方法。

解决方案 1：

原始字符串不是另一种类型的字符串。它们是在源代码中描述字符串的另一种方式。一旦创建了字符串，它就是原来的字符串。

解决方案 2：

我相信您正在寻找的是 str.encode("string-escape") 函数。例如，如果您有一个要“原始字符串”的变量：

a = 'x89'
a.encode('unicode_escape').decode()
'\'

注意：适用string-escape于 Python 2.x 及更早版本

我正在寻找类似的解决方案，并通过以下方式找到了解决方案：
转换原始字符串 python

解决方案 3：

由于 Python 中的字符串是不可变的，因此您无法“改变”任何内容。但是，您可以从中创建一个新的原始字符串s，如下所示：

raw_s = r'{}'.format(s)

解决方案 4：

从 Python 3.6 开始，您可以使用以下命令（类似于@slashCoder）：

def to_raw(string):
    return fr"{string}"

my_dir ="C:dataprojects"
to_raw(my_dir)

产量'C:\data\projects'。我在 Windows 10 机器上使用它来将目录传递给函数。

解决方案 5：

原始字符串仅适用于字符串文字。它们存在的目的是让您可以更方便地表达将被转义序列处理修改的字符串。这在用字符串文字写出正则表达式或其他形式的代码时尤其有用。如果您想要一个没有转义处理的 unicode 字符串，只需在其前面加上ur，例如ur'somestring'。

解决方案 6：

对于 Python 3，执行此操作不添加双反斜杠并仅保留`
, `等的方法是：

a = 'hello
bobby
sally
'
a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(a)

它给出一个可以写为 CSV 的值：

hello
bobby
sally

然而，对于其他特殊字符，似乎没有解决方案，这些字符前面可能会有一个 \。这很令人沮丧。解决这个问题会很复杂。

例如，要将包含特殊字符的字符串列表序列化为BERTpandas.Series期望格式的文本文件，每个句子之间有一个 CR，每个文档之间都有一个空行：

with open('sentences.csv', 'w') as f:

    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('
')
        # Write each sentence exactly as it appared to one line each
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '
')

输出结果（针对 Github CodeSearchNet 中所有语言的文档字符串，已将其标记为句子）：

Makes sure the fast-path emits in order.
@param value the value to emit or queue up
@param delayError if true, errors are delayed until the source has terminated
@param disposable the resource to dispose if the drain terminates

Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends
a termination notification.
Scheduler:
{@code amb} does not operate by default on a particular {@link Scheduler}.
@param  the common element type
@param sources
an Iterable of ObservableSource sources competing to react first.
A subscription to each source will
occur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first
emitted an item or sent a termination notification
@see ReactiveX operators documentation: Amb


...

解决方案 7：

格式如下：

s = "your string"; raw_s = r'{0}'.format(s)

解决方案 8：

s = "hel
lo"
raws = '%r'%s #coversion to raw string
#print(raws) will print 'hel
lo' with single quotes.
print(raws[1:-1]) # will print hel
lo without single quotes.
#raws[1:-1] string slicing is performed

解决方案 9：

我认为 repr 函数可以帮助你：

s = 't
'
repr(s)
"'t\\n'"
repr(s)[1:-1]
't\\n'

解决方案 10：

对我有用的解决方案是：

fr"{orignal_string}"

@ChemEnger 在评论中建议

解决方案 11：

稍微纠正一下@Jolly1234的回答：代码如下：

raw_string=path.encode('unicode_escape').decode()

解决方案 12：

只需简单使用编码函数即可。

my_var = 'hello'
my_var_bytes = my_var.encode()
print(my_var_bytes)

然后将其转换回常规字符串，执行以下操作

my_var_bytes = 'hello'
my_var = my_var_bytes.decode()
print(my_var)

- 编辑 -

下面的操作不会将字符串变为原始字符串，而是将其编码为字节，然后再解码。