sed|awk extract multiple parts of one line

Discussion in 'Site & Server Administration' started by postcd, Jan 10, 2017.

postcd Well-Known Member

#1

Hello,

Is there anyone skilled enough to help me with this, please?

On Linux I have a file with one very long line that contains the following phrase many times:

<a href='/download/392931' class='download' .... some text here ... <a href='/download/366' class='download' ... another text ... <a href='/download/2' class='download'

I need to extract only the numbers that follow the "/download/" phrase, or alternatively the whole "/download/numberhere" phrases. The line contains other numbers too, so I need to pick out only the ones after "download/".

So the output should be the numbers, one per line, or "/download/numberhere", one per line. :-S

Maybe it can be done by extracting the string that sits between two other strings:
<td class='download'> <a href='/download/392931' class='download'
but the command would have to repeat the extraction several times on a single line, which makes it harder.

postcd, Jan 10, 2017 IP
zacharooni Well-Known Member

#2

zach@sigma:~/Desktop$ bash test.sh | sort | uniq
/download/2
/download/366
/download/392931
zach@sigma:~/Desktop$ cat test.sh
#!/bin/bash
FILE=test
grep -oP '/download/(\d+)' "$FILE"
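
If only the numbers are wanted, which the question also allows, the same idea can be adapted. The following is a minimal sketch rather than part of the original reply: the `sample` variable is a hypothetical stand-in for the file's long line, and `-P` assumes GNU grep built with PCRE support.

```shell
#!/bin/bash
# Hypothetical sample input, modelled on the line shown in the question.
# The stray "42" stands in for the unrelated numbers the line may contain.
sample="<a href='/download/392931' class='download' some text 42 <a href='/download/366' class='download' another text <a href='/download/2' class='download'"

# -o prints each match on its own line, so several matches per line are fine.
# \K discards everything matched so far, leaving only the digits in the output.
numbers=$(printf '%s\n' "$sample" | grep -oP '/download/\K\d+')

# Alternative without -P: match the whole phrase with an extended regex,
# then keep the third "/"-separated field (the number).
numbers_portable=$(printf '%s\n' "$sample" | grep -oE '/download/[0-9]+' | cut -d/ -f3)

printf '%s\n' "$numbers"
```

With this sample the script prints 392931, 366 and 2, one per line, and skips the stray 42. To run it on a real file, replace the `printf ... |` part with the file name as the last argument to `grep`.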

zacharooni, Jan 18, 2017 IP
