How do I extract numbers from a string?

2024-11-08 09:04:00

admin

Problem description:

I have a string containing a path:

string="toto.titi.12.tata.2.abc.def"

I want to extract only the numbers from this string.

To extract the first number:

tmp="${string#toto.titi.}"    # strip the fixed prefix -> 12.tata.2.abc.def
num1="${tmp%.tata*}"          # strip from ".tata" to the end -> 12

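To extract the second number:

tmp="${string#toto.titi.*.tata.}"    # strip through "tata." -> 2.abc.def
num2="${tmp%.abc.def}"               # strip the fixed suffix -> 2

So I need two steps to extract each value. How can I extract a number in a single step?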

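Solution 1:

You can use tr to delete every non-digit character, as shown below. Note that this joins the digits together, so for the example string it prints 122: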

echo toto.titi.12.tata.2.abc.def | tr -d -c 0-9
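If the numbers need to stay separate, one possible variation is to translate each run of non-digits into a single space instead of deleting it. This is only a sketch; the variable names nums, num1 and num2 are illustrative, and it assumes the string contains exactly the two numbers of interest.

```bash
string="toto.titi.12.tata.2.abc.def"

# -c complements the set (everything that is not a digit);
# -s squeezes each translated run into a single space.
nums=$(printf '%s' "$string" | tr -cs '0-9' ' ')

# Unquoted on purpose: word splitting turns the result into $1 and $2.
set -- $nums
num1=$1
num2=$2
echo "$num1 / $num2"    # prints: 12 / 2
```

Because it relies only on tr and set, this form should also work in a plain POSIX sh.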

Solution 2:

Here is a short example:

string="toto.titi.12.tata.2.abc.def"
id=$(echo "$string" | grep -o -E '[0-9]+')

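echo $id    # => output: 12 2

grep -o prints each match on its own line; because $id is left unquoted in the echo, the newline is collapsed and the numbers come out separated by a space. Hope this helps.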
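Since the question asks for one number at a time, the same grep call can be combined with sed to select a single match. A minimal sketch, assuming a grep that supports -o (GNU and BSD grep do); nth_number is a hypothetical helper name, not something from the answer above.

```bash
# nth_number STRING N: print the N-th run of digits found in STRING.
nth_number() {
    printf '%s\n' "$1" | grep -oE '[0-9]+' | sed -n "${2}p"
}

string="toto.titi.12.tata.2.abc.def"
nth_number "$string" 1    # prints: 12
nth_number "$string" 2    # prints: 2
```

If there is no N-th number, the function simply prints nothing.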

Solution 3:

To extract all the individual numbers through a pipe and print one number per line:

tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

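Breakdown:

Replace all newlines with spaces: tr '\n' ' '
Replace all non-digits with spaces: sed -e 's/[^0-9]/ /g'
Remove leading spaces: -e 's/^ *//g'
Remove trailing spaces: -e 's/ *$//g'
Squeeze each run of spaces into a single space: tr -s ' '
Replace the remaining space separators with newlines: sed 's/ /\n/g'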

Example:

echo -e " this 20 is 2sen\nten324ce 2 sort of" | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed 's/ /\n/g'

This will print (with GNU sed, which understands \n in the replacement):

20
2
324
2

Solution 4:

Parameter expansion seems like the natural tool for this.

$ string="toto.titi.12.tata.2.abc.def"
$ read num1 num2 <<<${string//[^0-9]/ }
$ echo "$num1 / $num2"
12 / 2

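This of course depends on the format of $string. But at least for the example you provided, it seems to work.

This may be preferable to anubhava's awk solution, which requires a subshell. I also like chepner's solution, but regular expressions are "heavier" than parameter expansion (though obviously more precise). (Note that in the expression above, [^0-9] may look like a regex atom, but it is not; it is a shell pattern-matching bracket expression.)

You can read about this form of parameter expansion in the bash man page. Note that ${string//this/that} (as well as <<<) is a bashism and is not compatible with traditional Bourne or POSIX shells.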
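For a shell without these bashisms, a rough equivalent can be sketched with IFS splitting and the positional parameters. Unlike the pattern substitution above, this version assumes the numbers always sit in the third and fifth dot-separated fields; old_ifs, num1 and num2 are illustrative names.

```bash
string="toto.titi.12.tata.2.abc.def"

old_ifs=$IFS
IFS=.
set -f              # disable globbing while $string is expanded unquoted
set -- $string      # $1=toto $2=titi $3=12 $4=tata $5=2 ...
set +f
IFS=$old_ifs

num1=$3
num2=$5
echo "$num1 / $num2"    # prints: 12 / 2
```

Note that set -- overwrites the script's own positional parameters, so inside a larger script this is better wrapped in a function.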

Solution 5:

Convert your string into an array like this:

$ str="toto.titi.12.tata.2.abc.def"
$ arr=( ${str//[!0-9]/ } )
$ echo "${arr[@]}"
12 2
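Once the numbers are in an array, they can be counted and processed individually. The following bash-only sketch reuses the same expansion and relies on the default IFS (space, tab, newline) for the unquoted word splitting; count is an illustrative variable name.

```bash
str="toto.titi.12.tata.2.abc.def"

# Every non-digit becomes a space; word splitting then yields one
# array element per run of digits.
arr=( ${str//[!0-9]/ } )

count=${#arr[@]}
echo "found $count numbers"    # prints: found 2 numbers

for n in "${arr[@]}"; do
    echo "$n"                  # prints 12, then 2
done
```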

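Solution 6:

This question would be easier to answer if you gave the exact output you want. If you mean that you just want the digits from the string, with everything else removed, you can do this: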

d@AirBox:~$ string="toto.titi.12.tata.2.abc.def"
d@AirBox:~$ echo "${string//[a-z,.]/}"
122

If you clarify a bit, I may be able to help more.

Solution 7:

You can also use sed:

echo "toto.titi.12.tata.2.abc.def" | sed 's/[0-9]*//g'

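Here, sed replaces

any digit (the class [0-9])
repeated any number of times (*)
with nothing (there is nothing between the second and third /),
and g stands for global.

The output will be the following (note that this removes the numbers and keeps everything else):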

toto.titi..tata..abc.def
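The sed command above deletes the digits. If the goal is instead to keep only the numbers, one option is to invert the character class and replace each run of non-digits with a space. This is a sketch rather than part of the original answer; result is an illustrative variable name.

```bash
result=$(echo "toto.titi.12.tata.2.abc.def" |
    sed -e 's/[^0-9][^0-9]*/ /g' -e 's/^ //' -e 's/ $//')

echo "$result"    # prints: 12 2
```

The first expression turns every run of non-digits into one space; the other two trim the single space left at each end.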

Solution 8:

Use regular-expression matching:

string="toto.titi.12.tata.2.abc.def"
[[ $string =~ toto.titi.([0-9]+).tata.([0-9]+). ]]
# BASH_REMATCH[0] would be "toto.titi.12.tata.2.", the entire match
# Successive elements of the array correspond to the parenthesized
# subexpressions, in left-to-right order. (If there are nested parentheses,
# they are numbered in depth-first order.)
first_number=${BASH_REMATCH[1]}
second_number=${BASH_REMATCH[2]}
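In the pattern above, each unescaped . matches any character, and the result of the match is not checked. A slightly stricter variant, offered as a sketch, stores the pattern in a variable (here called re, an illustrative name), escapes the dots, anchors the start, and tests the return status before reading BASH_REMATCH.

```bash
string="toto.titi.12.tata.2.abc.def"
re='^toto\.titi\.([0-9]+)\.tata\.([0-9]+)\.'

if [[ $string =~ $re ]]; then
    first_number=${BASH_REMATCH[1]}
    second_number=${BASH_REMATCH[2]}
    echo "$first_number / $second_number"    # prints: 12 / 2
else
    echo "string does not match the expected layout" >&2
fi
```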

Solution 9:

Using awk:

arr=( $(echo $string | awk -F "." '{print $3, $5}') )
num1=${arr[0]}
num2=${arr[1]}
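The awk call above hard-codes fields 3 and 5. If the positions are not known in advance, a possible variation is to print every field that consists only of digits. A sketch, assuming the numbers always occupy a whole dot-separated field; result is an illustrative variable name.

```bash
string="toto.titi.12.tata.2.abc.def"

result=$(printf '%s\n' "$string" | awk -F. '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++)
        if ($i ~ /^[0-9]+$/) {    # keep only all-digit fields
            out = out sep $i
            sep = " "
        }
    print out
}')

echo "$result"    # prints: 12 2
```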

Solution 10:

Another way is to use cut:

echo $string | cut -d'.' -f3,5 | tr '.' ' '

This gives you the following output: 12 2

Solution 11:

A fix for the newline problem (for the Mac terminal):

cat temp.txt | tr '\n' ' ' | sed -e 's/[^0-9]/ /g' -e 's/^ *//g' -e 's/ *$//g' | tr -s ' ' | sed $'s/ /\\\n/g'

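Solution 12:

Assumptions:

there is no embedded whitespace
the text string always contains 7 period-delimited strings
the string always has numbers in the 3rd and 5th period-delimited positions

A bash idea that does not need to spawn any subprocesses: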

$ string="toto.titi.12.tata.2.abc.def"

$ IFS=. read -r x1 x2 num1 x3 num2 rest <<< "${string}"
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"

In the comments, the OP said they want to extract only one number at a time; the same approach still works, for example:

$ string="toto.titi.12.tata.2.abc.def"

$ IFS=. read -r x1 x2 num1 rest <<< "${string}"
$ typeset -p num1
declare -- num1="12"

$ IFS=. read -r x1 x2 x3 x4 num2 rest <<< "${string}"
$ typeset -p num2
declare -- num2="2"

A variation on anubhava's answer that uses parameter expansion instead of a subprocess call to awk, under the same initial assumptions:

$ arr=( ${string//./ } )
$ num1=${arr[2]}
$ num2=${arr[4]}
$ typeset -p num1 num2
declare -- num1="12"
declare -- num2="2"