Let’s say you have an HTML file called file.html and you want to replace “.jpg” with “.png”, but only inside the href of anchor elements.
Example of input:
[html]<a href="alice.jpg">alice</a>
<a href="bob.jpg">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
[/html]
Desired output:
[html]<a href="alice.png">alice</a>
<a href="bob.png">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
[/html]
Notice that only the first two occurrences of “.jpg” were changed to “.png”: the ones in the href of an anchor.
You can use sed with regexes to achieve this.
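[bash]
sed -i -E 's/(<a href="[^"]*)\.jpg(")/\1.png\2/' file.html
[/bash]
Where:
- -i edits the file in place (this form is for GNU sed; BSD/macOS sed expects a backup suffix after -i, e.g. -i '')
- -E enables extended regular expressions, so groups can be written as ( ) without backslashes
- s/// is the substitute command
- (<a href="[^"]*) is group 1: the string '<a href="' followed by zero or more characters that are not a double quote, so the match cannot run past the end of the href
- \.jpg is the literal .jpg we want to replace (the backslash stops the dot from matching any character)
- (") is group 2: just the closing double quote
- \1.png\2 is the replacement: group 1, then .png, then group 2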
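Before editing a real file in place, it can help to try the expression on a scratch copy first. The sketch below is only an illustration: the temporary file from mktemp and the variable names tmp and result are assumptions, not part of the original recipe. It uses a form of the expression that stops at the closing quote ([^"]*) and escapes the dot (\.jpg). Leaving out -i makes sed print the result instead of modifying the file.

```shell
#!/bin/sh
# Build a scratch copy of the sample input (the path comes from mktemp; it is hypothetical).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<a href="alice.jpg">alice</a>
<a href="bob.jpg">bob</a>
something.jpg
href.jpg
<a href="example.com">alice.jpg</a>
<img src="href.jpg">
EOF

# Without -i, sed writes the result to stdout and leaves the file untouched.
result=$(sed -E 's/(<a href="[^"]*)\.jpg(")/\1.png\2/' "$tmp")
printf '%s\n' "$result"

# Show only the lines that would change (diff exits non-zero when the inputs differ).
printf '%s\n' "$result" | diff "$tmp" - || true

# Clean up the scratch file.
rm -f "$tmp"
```

If the diff shows only the anchor href lines, the same expression can then be run with -i on the real file.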