字符串查找替换
查找子串
BEGIN{
}
{
content = "That's a dog, in the room, who is go the google to search 101 dog.";
# 匹配成功则返回第一次匹配成功内容在字符串中的起始位置
print("index: "index(content, "dog"));
print("match: "match(content, "dog"));
# 找不到匹配项的时候返回0
print("index: "index(content, "man"));
print("match: "match(content, "man"));
# 查找标点符号
print("index: "index(content, /[^a-z,A-Z,0-1]/))
print("match: "match(content, /[^a-z,A-Z,0-1]/));
}
END{
}
$echo ""|awk -f chapter_3_5-1.awk
index: 10
match: 10
index: 0
match: 0
index: 60
match: 5
index()函数原型:
index(s, t)
s 待查找字符串
t 目标子串
返回第一次匹配成功的索引位置,失败时返回0
match 函数原型:
match(s, r [, a])
s 待查找字符串
r 查询的正则表达式 a 结果二维数组,可选参数,如果匹配成功保存第一个匹配到字符串的相关信息,a[0,"start"]: 首个匹配成功子串的开始位置,a[0,"length"]: 首个匹配成功子串的长度,a[0]: 首次匹配成功的子字符串,匹配失败时数组为空
返回第一次匹配成功的索引位置,RSTART设置成第一次匹配成功的索引位置,RLENGTH设置成匹配成功子串的长度(匹配失败为-1)
BEGIN{
}
{
content = "That's a dog, in the room, the dark room, who is go the google to search 101 dog.";
print(match(content, /[r,g]..[e,m]/, array));
# 第一个匹配的开始索引位置
print("RSTART: "RSTART);
# 第一个匹配的子串长度
print("RLENGTH: "RLENGTH);
print("");
for(key in array) {
# 结果数组是一个二维数组,存放第一个匹配的相关信息
len = split(key, keys, SUBSEP);
for (i=1; i<=len; ++i)="" {="" print("key["i"]:="" "="" keys[i]);="" }="" print("value:="" array[key]);="" if="" (="" rstart=""> 0 ) {
print("match string: " array[0]);
}
}
END{
}
=len;>
$echo ""|awk -f chapter_3_5-2.awk 22
RSTART: 22
RLENGTH: 4key[1]: 0
key[2]: start
value: 22
key[1]: 0
key[2]: length
value: 4
key[1]: 0
value: room
match string: room
match()可以看作是增强版的index(),支持正则表达式,并且返回的内容更多。
替换子串
BEGIN{
}
{
content = "That's a dog, in the room, the dark room, who is go the google to search 101 dog.";
# 只替换第一次匹配成功的子串
print("");
str = content;
num = sub("dog", "cat", str);
print("replace num: " num);
print("after replace: " str);
# 替换所有匹配成功的子串
print("");
str = content;
num = gsub("dog", "cat", str);
print("replace num: " num);
print("after replace: " str);
# 替换所有匹配成功子串或是第二次匹配到的子串
print("");
str = gensub("dog", "cat", "g", content);
print("content: " content);
print("str: " str);
print("");
str = gensub("dog", "cat", 2, content);
print("content: " content);
print("str: " str);
}
END{
}
$echo ""|awk -f chapter_3_5-3.awk
replace num: 1
after replace: That's a cat, in the room, the dark room, who is go the google to search 101 dog.replace num: 2
after replace: That's a cat, in the room, the dark room, who is go the google to search 101 cat.content: That's a dog, in the room, the dark room, who is go the google to search 101 dog.
str: That's a cat, in the room, the dark room, who is go the google to search 101 cat.content: That's a dog, in the room, the dark room, who is go the google to search 101 dog.
str: That's a dog, in the room, the dark room, who is go the google to search 101 cat.
sub()与gsub()参数用法完全相同,唯一的区别是sub()只替换首次匹配到的子串,gsub()替换所有匹配到的子串,而gensub()更为灵活,可以指定替换字串出现的位置或是全局替换,并且不会改变原字符串的值,函数原型如下:
sub(r, s [, t])
gsub(r, s [, t])
r 用于匹配的正则表达式
s 要替换的字符串值
t 目标字符串,可选参数,默认使用$0
返回,成功替换子串的数目gensub(r, s, h [, t])
r 用于匹配的正则表达式
s 要替换的字符串值
h "g"用来全局替换,或是用数字指定字串出现的位置 t 目标字符串,可选参数,默认使用$0