Data cleaning without programming experience?

Discussion in 'Programming' started by darrens, Jun 7, 2012.

  1. #1
    Guys,

    I have a CSV file with 700+ records. Each record has about 20 fields.
    In 3/4 of the fields I have text that contains some basic HTML like <br />, <b>, <p>, etc.
    Within this text I have just noticed there are lots of umlauts (special characters).
    I need to clean this content and replace the umlauts with the correct numeric values.
    For example:
    &#172; = ¬

    I have no programming experience and use Notepad++ to edit this text.
    I think there are just over 100 umlauts.

    To save myself a really boring job, I need a way to 'find and replace' them automatically.
    Any suggestions?

    Thanks
     
    darrens, Jun 7, 2012 IP
  2. spids

    #2
    Use Notepad++, it's great for editing files.

    Go to Search > Find in Files.

    There you can enter anything and replace it with anything. Keep in mind that you can also replace text with a space.
     
    spids, Jun 8, 2012 IP
  3. Fortix Servers

    #3
    Actually, this is quite easy to code.
    You just need to write a loop that searches for each specific code and then replaces it.
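
    A minimal sketch of such a loop, assuming Python is available. The mapping and the function name replace_special_chars are made up for illustration; the mapping only lists a few characters and would need to be extended with whatever actually occurs in the file.

```python
# Hypothetical mapping from special characters to numeric HTML entities.
# Extend it with the characters that actually appear in the CSV file.
REPLACEMENTS = {
    "¬": "&#172;",
    "ä": "&#228;",
    "ö": "&#246;",
    "ü": "&#252;",
}


def replace_special_chars(text, replacements=REPLACEMENTS):
    """Loop over the mapping and replace every occurrence of each character."""
    for char, entity in replacements.items():
        text = text.replace(char, entity)
    return text


print(replace_special_chars("<b>schön</b> für ¬"))
# prints: <b>sch&#246;n</b> f&#252;r &#172;
```

    Characters that are not in the mapping are left untouched, so the HTML tags in the fields survive as they are.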
     
    Fortix Servers, Jun 10, 2012 IP
  4. Rukbat

    #4
    If there are 100 different special codes (an umlaut is the two dots over a vowel used in some languages), you'll have to do 100 global search-and-replace operations if you want to do it without programming. Notepad++ is fine to use for this. (You could also open the file in a web browser. If the page shows up right, highlight the whole page, then copy and paste it into Notepad, which ignores almost everything but pure ASCII.)
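
    For comparison, if a little programming is acceptable, the 100 separate operations can probably be collapsed into one pass. The sketch below assumes Python and assumes the file is saved as UTF-8 (it may well be Windows-1252 instead, in which case the encoding argument has to change). The function names and the file names in the usage note are placeholders.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with its numeric HTML entity."""
    # The "xmlcharrefreplace" error handler turns each character that
    # ASCII cannot represent into &#NNN; and leaves everything else alone.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def clean_file(src_path, dst_path, encoding="utf-8"):
    """Read src_path, convert its special characters, write dst_path."""
    # newline="" keeps the original line endings of the CSV unchanged.
    with open(src_path, encoding=encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="ascii", newline="") as dst:
        dst.write(to_numeric_entities(text))


print(to_numeric_entities("<p>Müller ¬ Straße</p>"))
# prints: <p>M&#252;ller &#172; Stra&#223;e</p>
```

    Calling clean_file("input.csv", "output.csv") would then write a cleaned copy and leave the original file alone. Commas, quotes and HTML tags are plain ASCII, so the CSV structure should not change.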
     
    Rukbat, Jun 16, 2012 IP
  5. Ev0Lv

    #5
    You can do it in any text editor, I guess, with the search-and-replace feature.
     
    Ev0Lv, Jun 17, 2012 IP