Get text from HTML source

Discussion in 'PHP' started by Nuzhser, Jun 25, 2010.

  1. #1
    I fetch the source of some files, then I need to extract fragments of text from it. Which functions do you recommend?
     
    Nuzhser, Jun 25, 2010 IP
  2. AsHinE

    AsHinE Well-Known Member

    #2
    I'd guess string functions, regular expressions, or XPath.
    It depends on what text you have to extract and by what criteria.
    Please provide more details, or an example of the source file and the desired output.
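
    For the XPath route, a minimal sketch using PHP's built-in DOM extension might look like the following. The sample markup and the `post` class name are assumptions made up for illustration, since the thread never shows the real source file.

    ```php
    <?php
    // Hypothetical sample markup; the real source is not shown in the thread.
    $html = '<html><body><div class="post"><p>First fragment</p></div>'
          . '<div class="ad"><p>Ignore me</p></div></body></html>';

    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // because real-world HTML is rarely well-formed.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // Select every <p> that is a direct child of <div class="post">.
    $fragments = [];
    foreach ($xpath->query('//div[@class="post"]/p') as $node) {
        $fragments[] = trim($node->textContent);
    }

    print_r($fragments);
    ```

    The query string is the only part that would need to change for a different extraction criterion.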
     
    AsHinE, Jun 25, 2010 IP
  3. techbongo

    techbongo Active Member

    #3
    explode() can work. You need to know regular expressions very well in order to dig the expected text out of a variety of patterns.
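
    As a rough illustration of the explode() approach, the sketch below pulls out the text between two fixed markers. The helper name `textBetween` and the sample markup are hypothetical; this only works when the surrounding markers are known and stable.

    ```php
    <?php
    // Hypothetical source; the markers are assumptions for illustration.
    $source = '<h1>Title</h1><span id="price">19.99 USD</span><p>More</p>';

    /**
     * Return the text between the first $start marker and the next $end
     * marker, or null if either marker is missing.
     */
    function textBetween(string $source, string $start, string $end): ?string
    {
        // Split once at the start marker; everything after it is in $parts[1].
        $parts = explode($start, $source, 2);
        if (count($parts) < 2) {
            return null; // start marker not found
        }
        // Split the remainder once at the end marker.
        $rest = explode($end, $parts[1], 2);
        if (count($rest) < 2) {
            return null; // end marker not found
        }
        return $rest[0];
    }

    $price = textBetween($source, '<span id="price">', '</span>');
    echo $price, "\n";
    ```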
     
    techbongo, Jun 25, 2010 IP
  4. MayurGondaliya

    MayurGondaliya Well-Known Member

    #4
    MayurGondaliya, Jun 25, 2010 IP
  5. adamsinfo

    adamsinfo Greenhorn

    #5
    If you want text from HTML, you want to work with the HTML DOM. Use 'simplehtmldom'.
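
    The same DOM idea can be sketched without a third-party library. The example below deliberately swaps in PHP's built-in DOMDocument rather than simplehtmldom, and the sample markup is an assumption for illustration.

    ```php
    <?php
    // Hypothetical sample markup, not taken from the thread.
    $html = '<html><body><h1>News</h1><p>Hello <b>world</b>!</p>'
          . '<p>Second paragraph</p></body></html>';

    $doc = new DOMDocument();
    // Keep parser warnings about sloppy markup out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $paragraphs = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        // textContent concatenates the text of all descendant nodes,
        // so nested tags such as <b> are flattened into plain text.
        $paragraphs[] = $p->textContent;
    }

    print_r($paragraphs);
    ```

    Because the parser walks the document tree, this keeps working when the text inside the tags changes, as long as the tag structure stays the same.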
     
    adamsinfo, Jun 25, 2010 IP
  6. w47w47

    w47w47 Peon

    #6
    Yeah, preg_match or preg_match_all is the way to go. ;)

    explode is faster, but it can only split on one fixed delimiter string.
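
    A small preg_match_all sketch, assuming the wanted text always sits between `<b>` tags (the sample string is made up for illustration):

    ```php
    <?php
    // Hypothetical source string.
    $source = '<li>Price: <b>10</b></li><li>Price: <b>25</b></li>';

    // Capture whatever sits between <b> and </b>. The lazy quantifier (.*?)
    // stops at the first closing tag instead of the last one.
    $count = preg_match_all('~<b>(.*?)</b>~s', $source, $matches);

    // $matches[0] holds the full matches, $matches[1] the first capture group.
    print_r($matches[1]);
    ```

    A pattern like this is tied to the exact markup, so it tends to break when attributes or nesting change.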
     
    w47w47, Jun 25, 2010 IP
  7. Nuzhser

    Nuzhser Peon

    #7
    adamsinfo
    simplehtmldom is really useful. Thank you. preg_match is less suitable because the text changes every time.
     
    Nuzhser, Jun 26, 2010 IP
  8. danx10

    danx10 Peon

    #8
    simplehtmldom itself uses regular expressions.
     
    danx10, Jun 26, 2010 IP
  9. Nuzhser

    Nuzhser Peon

    #9
    Oops, sorry.
    Could you suggest a way to get the image paths from the img tags in the source?
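
    One possible way to do this, sketched with PHP's built-in DOMDocument (an assumption; the thread does not settle on a method, and the sample markup and URLs are made up):

    ```php
    <?php
    // Hypothetical markup with one <img> that has no src attribute.
    $html = '<p><img src="/images/a.png" alt="A"><img alt="no src">'
          . '<img src="http://example.com/b.jpg"></p>';

    $doc = new DOMDocument();
    // Keep parser warnings about sloppy markup out of the output.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $paths = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        // getAttribute() returns an empty string when the attribute is absent.
        $src = $img->getAttribute('src');
        if ($src !== '') {
            $paths[] = $src;
        }
    }

    print_r($paths);
    ```

    The paths come back exactly as written in the markup, so relative ones would still need to be resolved against the page URL.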
     
    Nuzhser, Jun 26, 2010 IP