sed|awk extract multiple parts of one line

Discussion in 'Site & Server Administration' started by postcd, Jan 10, 2017.

  1. #1
    Hello,

    Is there anyone skilled enough to help me with this, please?

    On Linux I have a file with a very long line that contains the following phrase multiple times:

    <a href='/download/392931' class='download' .... some text here ... <a href='/download/366' class='download' ... another text ... <a href='/download/2' class='download'

    I need to extract only the numbers that come after the "/download/" phrase, or extract just the "/download/numberhere" phrases. The line contains other numbers, so I need to extract only the numbers after "download/".

    So the output should be the numbers, one per line, or /download/numberhere, one per line. :-S

    Maybe it can be done by extracting the string that sits between two strings:
    <td class='download'> <a href='/download/392931' class='download'
    but the command would have to do the extraction multiple times per line, which makes it harder.
     
    postcd, Jan 10, 2017
  2. zacharooni

    #2
    zach@sigma:~/Desktop$ bash test.sh | sort | uniq
    /download/2
    /download/366
    /download/392931
    zach@sigma:~/Desktop$ cat test.sh
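    #!/bin/bash
    # File that holds the long line to search.
    FILE=test
    # -o prints each match on its own line; -P enables Perl-style regex such as \d.
    grep -oP '/download/(\d+)' "$FILE"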
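
If only the numbers are wanted, as the question asks, a small variation may help. The following is a minimal sketch and has not been run against the original file: the sample line, the temporary file, and the function names extract_ids_grep, extract_ids_portable and extract_ids_awk are all made up for illustration. The first variant assumes GNU grep built with PCRE support (for \K); the other two avoid -P.

```shell
#!/bin/bash
# Hypothetical sample input modeled on the line quoted in the question.
# It contains stray numbers (17, 2017) that must NOT be extracted.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' "<a href='/download/392931' class='download' some text 17 <a href='/download/366' class='download' more 2017 <a href='/download/2' class='download'" > "$sample"

# Variant 1 (assumes GNU grep with -P): \K drops "/download/" from the
# reported match, so -o prints only the digits, one match per line.
extract_ids_grep() {
    grep -oP '/download/\K\d+' "$1"
}

# Variant 2 (no -P; still relies on the common -o extension):
# match the whole phrase, then strip the prefix with sed.
extract_ids_portable() {
    grep -oE '/download/[0-9]+' "$1" | sed 's|^/download/||'
}

# Variant 3 (plain awk): find each match in turn, print the digits after
# the 10-character "/download/" prefix, then continue with the rest of the line.
extract_ids_awk() {
    awk '{
        while (match($0, /\/download\/[0-9]+/)) {
            print substr($0, RSTART + 10, RLENGTH - 10)
            $0 = substr($0, RSTART + RLENGTH)
        }
    }' "$1"
}

# Demo: print the IDs found in the sample, one per line.
extract_ids_awk "$sample"
```

For the sample line, each function should print 392931, 366 and 2 on separate lines; the stray 17 and 2017 are skipped because they do not follow /download/. Piping the result through sort -n | uniq gives sorted, de-duplicated output.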
     
    zacharooni, Jan 18, 2017